On Fri, 2011-09-30 at 18:46 +1000, David Gibson wrote:
> On Mon, Sep 26, 2011 at 12:34:52PM -0600, Alex Williamson wrote:
> > On Mon, 2011-09-26 at 12:04 +0200, Alexander Graf wrote:
> > > On 26.09.2011 at 09:51, David Gibson <david@xxxxxxxxxxxxxxxxxxxxx> wrote:
> [snip]
> > > Also, if you can come up with an interface that does not have variable
> > > length descriptors but is still able to export all the required
> > > generic information, please send a proposal to the list :)
> > >
> >
> > Hi,
> >
> > The other obvious possibility is a pure ioctl interface. To match what
> > this proposal is trying to describe, plus the runtime interfaces, we'd
> > need something like:
>
> Right, this also seems a reasonable possibility to me, depending on
> the details.
>
> > /* :0 - PCI devices, :1 - Devices path device, 63:2 - reserved */
> > #define VFIO_DEVICE_GET_FLAGS _IOR(, , u64)
> >
> >
> > /* Return number of mmio/iop/config regions.
> >  * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > #define VFIO_DEVICE_GET_NUM_REGIONS _IOR(, , int)
> >
> > /* Return length for region index (may be zero) */
> > #define VFIO_DEVICE_GET_REGION_LEN _IOWR(, , u64)
> >
> > /* Return flags for region index
> >  * :0 - mmap'able, :1 - read-only, 63:2 - reserved */
> > #define VFIO_DEVICE_GET_REGION_FLAGS _IOR(, , u64)
> >
> > /* Return file offset for region index */
> > #define VFIO_DEVICE_GET_REGION_OFFSET _IOWR(, , u64)
>
> The above 3 can be folded into one "getregioninfo" call.

Yep, and the phys addr one below. We can use a flags bit to indicate
whether it's valid.
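
To make the folded "getregioninfo" idea concrete, here is a minimal
userspace sketch of what such a call might return. It is only an
illustration, not the interface that was proposed or merged: the struct
layout, the REGION_FLAG_* names, the PHYS_VALID bit position, and the
mock_get_region_info() helper are all assumptions. Only the two flag
meanings (mmap'able, read-only) and the count of 8 PCI regions come from
the thread, and the sizes and offsets are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 0 and 1 follow the thread; bit 2 is an assumed "phys addr valid" bit. */
#define REGION_FLAG_MMAP       (1ULL << 0)
#define REGION_FLAG_RO         (1ULL << 1)
#define REGION_FLAG_PHYS_VALID (1ULL << 2)

/* BAR0-5 + ROM + Config, as stated in the thread. */
#define NUM_PCI_REGIONS 8

/* Hypothetical combined result replacing the separate
 * LEN / FLAGS / OFFSET / PHYS_ADDR queries. */
struct region_info {
    uint32_t index;     /* in: region index */
    uint64_t len;       /* out: length, may be zero */
    uint64_t flags;     /* out: REGION_FLAG_* bits */
    uint64_t offset;    /* out: file offset of the region */
    uint64_t phys_addr; /* out: only meaningful if PHYS_VALID is set */
};

/* Stand-in for the kernel side: fills in info for a made-up PCI device. */
static int mock_get_region_info(struct region_info *info)
{
    assert(info != NULL);
    if (info->index >= NUM_PCI_REGIONS)
        return -1;

    /* Made-up contents: only BAR0 exists and is mmap'able. */
    info->len = (info->index == 0) ? 4096 : 0;
    info->flags = (info->index == 0) ? REGION_FLAG_MMAP : 0;
    info->offset = (uint64_t)info->index << 32; /* arbitrary encoding */
    info->phys_addr = 0; /* PCI: not implemented, so PHYS_VALID stays clear */
    return 0;
}

/* Callers check the flag bit instead of issuing a separate ioctl. */
static int region_has_phys_addr(const struct region_info *info)
{
    return (info->flags & REGION_FLAG_PHYS_VALID) != 0;
}
```

With this shape a caller makes one query per region and looks at the
flags to decide whether phys_addr can be trusted, which is the point of
using a flags bit for validity.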

> > /* Return physical address for region index - not implemented for PCI */
> > #define VFIO_DEVICE_GET_REGION_PHYS_ADDR _IOWR(, , u64)
> >
> >
> >
> > /* Return number of IRQs (Not including MSI/MSI-X for PCI) */
> > #define VFIO_DEVICE_GET_NUM_IRQ _IOR(, , int)
> >
> > /* Set IRQ eventfd for IRQ index, arg[0] = index, arg[1] = fd */
> > #define VFIO_DEVICE_SET_IRQ_EVENTFD _IOW(, , int)
> >
> > /* Unmask IRQ index */
> > #define VFIO_DEVICE_UNMASK_IRQ _IOW(, , int)
> >
> > /* Set unmask eventfd for index, arg[0] = index, arg[1] = fd */
> > #define VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFD _IOW(, , int)
> >
> >
> > /* Return the device tree path for type/index into the user
> >  * allocated buffer */
> > struct dtpath {
> >     u32 type; (0 = region, 1 = IRQ)
> >     u32 index;
> >     u32 buf_len;
> >     char *buf;
> > };
> > #define VFIO_DEVICE_GET_DTPATH _IOWR(, , struct dtpath)
> >
> > /* Return the device tree index for type/index */
> > struct dtindex {
> >     u32 type; (0 = region, 1 = IRQ)
> >     u32 index;
> >     u32 prop_type;
> >     u32 prop_index;
> > };
> > #define VFIO_DEVICE_GET_DTINDEX _IOWR(, , struct dtindex)
>
> I think those need some work, but that doesn't impinge on the core
> semantics.
>
> > /* Reset the device */
> > #define VFIO_DEVICE_RESET _IO(, ,)
> >
> >
> > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS _IOW(, , int)
> > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
>
> Why does this need separate controls, rather than just treating MSIs
> as interrupts beyond the first for PCI devices?

Well, we could say that PCI will always report 3 for
VFIO_DEVICE_GET_NUM_IRQ where 0 = legacy, 1 = MSI, 2 = MSI-X. ioctls on
unimplemented IRQs will fail, UNMASK* ioctls on non-level triggered
interrupts will fail, and the parameter to SET_IRQ_EVENTFD becomes
arg[0] = index, arg[1] = count, arg[2-n] = fd.
Maybe we'd then have a GET_IRQ_INFO that takes something like:

struct vfio_irq_info {
    int index;
    unsigned int count;
    u64 flags;
#define VFIO_IRQ_INFO_FLAGS_LEVEL (1 << 0)
};

count would be 0 on PCI if the type of interrupt isn't supported.
Better? Thanks,

Alex

> > Hope that covers it. Something I prefer about this interface is that
> > everything can easily be generated on the fly, whereas reading out a
> > table from the device means we really need to have that table somewhere
> > in kernel memory to easily support reading random offsets.
> > Thoughts?
>
> I certainly prefer it to the previous proposal.
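
The GET_IRQ_INFO semantics described in the reply above (three fixed PCI
indexes, count of 0 for an unsupported type, UNMASK failing on anything
that is not level triggered) can be sketched in plain userspace C. This
is a rough model under stated assumptions, not kernel code: the
PCI_IRQ_* names, struct mock_pci_dev, and both mock_* helpers are
hypothetical, and uint64_t stands in for the kernel's u64. Only struct
vfio_irq_info, VFIO_IRQ_INFO_FLAGS_LEVEL, and the 0/1/2 numbering come
from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* As proposed in the thread, with uint64_t in place of u64. */
struct vfio_irq_info {
    int index;
    unsigned int count;
    uint64_t flags;
#define VFIO_IRQ_INFO_FLAGS_LEVEL (1 << 0)
};

/* Numbering from the thread; the enumerator names are hypothetical. */
enum {
    PCI_IRQ_LEGACY = 0,
    PCI_IRQ_MSI    = 1,
    PCI_IRQ_MSIX   = 2,
    PCI_NUM_IRQS   = 3  /* what GET_NUM_IRQ would always report for PCI */
};

/* Hypothetical description of what a given device supports. */
struct mock_pci_dev {
    int has_intx;
    unsigned int msi_vectors;
    unsigned int msix_vectors;
};

/* Fill in count/flags for info->index; count stays 0 if unsupported. */
static int mock_get_irq_info(const struct mock_pci_dev *dev,
                             struct vfio_irq_info *info)
{
    assert(dev != NULL && info != NULL);
    if (info->index < 0 || info->index >= PCI_NUM_IRQS)
        return -1;

    info->count = 0;
    info->flags = 0;
    switch (info->index) {
    case PCI_IRQ_LEGACY:
        if (dev->has_intx) {
            info->count = 1;
            info->flags |= VFIO_IRQ_INFO_FLAGS_LEVEL;
        }
        break;
    case PCI_IRQ_MSI:
        info->count = dev->msi_vectors;
        break;
    case PCI_IRQ_MSIX:
        info->count = dev->msix_vectors;
        break;
    }
    return 0;
}

/* Unmask fails on unimplemented or non-level-triggered interrupts. */
static int mock_unmask_irq(const struct mock_pci_dev *dev, int index)
{
    struct vfio_irq_info info = { .index = index };

    if (mock_get_irq_info(dev, &info) != 0)
        return -1;
    if (info.count == 0)
        return -1;
    if (!(info.flags & VFIO_IRQ_INFO_FLAGS_LEVEL))
        return -1;
    return 0;
}
```

For a device with INTx and four MSI vectors but no MSI-X, querying
index 1 would give count 4, querying index 2 would give count 0, and
only index 0 could be unmasked.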