On Mon, Sep 26, 2011 at 06:59:33PM -0500, Scott Wood wrote:
> On 09/26/2011 01:34 PM, Alex Williamson wrote:
> > The other obvious possibility is a pure ioctl interface. To match what
> > this proposal is trying to describe, plus the runtime interfaces, we'd
> > need something like:
> >
> > /* :0 - PCI devices, :1 - Devices path device, 63:2 - reserved */
> > #define VFIO_DEVICE_GET_FLAGS _IOR(, , u64)
> >
> >
> > /* Return number of mmio/iop/config regions.
> >  * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > #define VFIO_DEVICE_GET_NUM_REGIONS _IOR(, , int)
>
> How do you handle BARs that a particular device doesn't use? Zero-length?
>
> > /* Return the device tree path for type/index into the user
> >  * allocated buffer */
> > struct dtpath {
> >         u32 type; (0 = region, 1 = IRQ)
> >         u32 index;
> >         u32 buf_len;
> >         char *buf;
> > };
> > #define VFIO_DEVICE_GET_DTPATH _IOWR(, , struct dtpath)
>
> So now the user needs to guess a buffer length in advance... and what
> happens if it's too small?
>
> > /* Reset the device */
> > #define VFIO_DEVICE_RESET _IO(, ,)
>
> What generic way do we have to do this? We should probably have a way
> to determine whether it's possible, without actually asking to do it.

That's a good point. PCI devices have a standardized reset, but
embedded devices often won't. Mind you, we could just fail the call in
that case.

> > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS _IOW(, , int)
> > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
> >
> > Hope that covers it.
>
> It could be done this way, but I predict that the code (both kernel and
> user side) will be larger. Maybe not much more complex, but more
> boilerplate.
>
> How will you manage extensions to the interface? With the table it's
> simple, you see a new (sub)record type and you either understand it or
> you skip it.
> With ioctls you need to call every information-gathering
> ioctl you know and care about (or are told is present via some feature
> advertisement), and see if there's anything there.

No, quite the opposite. With ioctl()s you call the ones your userspace
program cares about / can implement. When an extended interface is
added, they keep working as is. Newer userspace which uses the new
features will call the new ioctl()s if it cares about them.

> > Something I prefer about this interface is that
> > everything can easily be generated on the fly, whereas reading out a
> > table from the device means we really need to have that table somewhere
> > in kernel memory to easily support reading random offsets. Thoughts?
>
> The table should not be particularly large, and you'll need to keep the
> information around in some form regardless. Maybe in the PCI case you
> could produce it dynamically (though I probably wouldn't), but it really
> wouldn't make sense in the device tree case.
>
> You also lose the ability to easily have a human look at the hexdump for
> debugging; you'll need a special "lsvfio" tool. You might want one
> anyway to pretty-print the info, but with ioctls it's mandatory.
>
> -Scott
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
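
To illustrate the two points made in the reply above (an unsupported
reset can simply fail the call, and userspace only issues the ioctls it
knows about), here is a minimal userspace-only sketch in C. Nothing in
it is the real VFIO interface: demo_ioctl(), struct demo_dev and the
DEMO_* command codes are hypothetical stand-ins, since the thread leaves
the actual _IO() numbers blank. The one convention borrowed from real
kernels is that an ioctl the driver does not recognize fails with
ENOTTY.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command codes; the proposal leaves the real numbers blank. */
enum demo_cmd {
    DEMO_GET_NUM_REGIONS = 1,
    DEMO_RESET = 2,
};

/* Stand-in for a device: one bit per command the "driver" implements. */
struct demo_dev {
    unsigned int supported;   /* bitmask of (1u << cmd) */
    int num_regions;
};

/* Stand-in for ioctl(2): commands the device lacks fail with ENOTTY. */
static int demo_ioctl(struct demo_dev *dev, enum demo_cmd cmd, int *out)
{
    if (!(dev->supported & (1u << cmd))) {
        errno = ENOTTY;
        return -1;
    }
    switch (cmd) {
    case DEMO_GET_NUM_REGIONS:
        if (out == NULL) {
            errno = EFAULT;
            return -1;
        }
        *out = dev->num_regions;
        return 0;
    case DEMO_RESET:
        return 0;             /* pretend the reset succeeded */
    default:
        errno = EINVAL;
        return -1;
    }
}

/*
 * Returns 1 if the device was reset, 0 if it has no reset method,
 * and -1 on any other error.
 */
static int demo_try_reset(struct demo_dev *dev)
{
    if (demo_ioctl(dev, DEMO_RESET, NULL) == 0)
        return 1;
    return errno == ENOTTY ? 0 : -1;
}
```

A caller that knows nothing about DEMO_RESET never issues it and keeps
working unchanged; a caller that does can treat a return of 0 from
demo_try_reset() as "reset not available" rather than as a fatal error.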