Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files

On Fri, 2011-09-30 at 10:37 -0600, Alex Williamson wrote:
> On Fri, 2011-09-30 at 18:46 +1000, David Gibson wrote:
> > On Mon, Sep 26, 2011 at 12:34:52PM -0600, Alex Williamson wrote:
> > > On Mon, 2011-09-26 at 12:04 +0200, Alexander Graf wrote:
> > > > On 26.09.2011 at 09:51, David Gibson <david@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > [snip]
> > > > Also, if you can come up with an interface that does not have variable
> > > > length descriptors but is still able to export all the required
> > > > generic information, please send a proposal to the list :)
> > > > 
> > > 
> > > Hi,
> > > 
> > > The other obvious possibility is a pure ioctl interface.  To match what
> > > this proposal is trying to describe, plus the runtime interfaces, we'd
> > > need something like:
> > 
> > Right, this also seems a reasonable possibility to me, depending on
> > the details.
> > 
> > > /* :0 - PCI devices, :1 - Device tree path device, 63:2 - reserved */
> > > #define VFIO_DEVICE_GET_FLAGS			_IOR(, , u64)
> > > 
> > > 
> > > /* Return number of mmio/iop/config regions.
> > >  * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > > #define VFIO_DEVICE_GET_NUM_REGIONS		_IOR(, , int)
> > > 
> > > /* Return length for region index (may be zero) */
> > > #define VFIO_DEVICE_GET_REGION_LEN		_IOWR(, , u64)
> > > 
> > > /* Return flags for region index
> > >  * :0 - mmap'able, :1 - read-only, 63:2 - reserved */
> > > #define VFIO_DEVICE_GET_REGION_FLAGS		_IOR(, , u64)
> > > 
> > > /* Return file offset for region index */
> > > #define VFIO_DEVICE_GET_REGION_OFFSET		_IOWR(, , u64)
> > 
> > The above 3 can be folded into one "getregioninfo" call.
> 
> Yep, and the phys addr one below.  We can use a flags bit to indicate
> whether it's valid.
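
As a rough illustration of the folded call, a single region-info structure could carry length, flags, offset and the optional physical address together. This is only a sketch: the struct name, the field layout and the choice of bit 2 for "phys addr valid" are assumptions made for the example, not what the branch implements. Bits 0 and 1 follow the flags comment quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 0 and 1 mirror the region flags described in the proposal. */
#define REGION_FLAG_MMAP        (UINT64_C(1) << 0)  /* region can be mmap'ed */
#define REGION_FLAG_RO          (UINT64_C(1) << 1)  /* region is read-only */
/* Assumption: bit 2 marks phys_addr as valid (never set for PCI). */
#define REGION_FLAG_PHYS_VALID  (UINT64_C(1) << 2)

/* Hypothetical combined result: the caller fills in index, the kernel
 * would fill in everything else. */
struct region_info {
	uint32_t index;      /* in:  region index */
	uint64_t flags;      /* out: REGION_FLAG_* bits */
	uint64_t len;        /* out: region length, may be zero */
	uint64_t offset;     /* out: file offset of the region in the device fd */
	uint64_t phys_addr;  /* out: only meaningful if REGION_FLAG_PHYS_VALID */
};

/* A zero-length region is reported but not implemented. */
static bool region_present(const struct region_info *r)
{
	assert(r != NULL);
	return r->len != 0;
}

/* Copy out the physical address only when the flags bit says it is valid. */
static bool region_phys_addr(const struct region_info *r, uint64_t *out)
{
	assert(r != NULL && out != NULL);
	if (!(r->flags & REGION_FLAG_PHYS_VALID))
		return false;
	*out = r->phys_addr;
	return true;
}
```

With this shape, a PCI BAR would come back with REGION_FLAG_PHYS_VALID clear, so callers never need a separate "not implemented" error path for the physical address.
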
> 
> > > /* Return physical address for region index - not implemented for PCI */
> > > #define VFIO_DEVICE_GET_REGION_PHYS_ADDR	_IOWR(, , u64)
> > > 
> > > 
> > > 
> > > /* Return number of IRQs (Not including MSI/MSI-X for PCI) */
> > > #define VFIO_DEVICE_GET_NUM_IRQ			_IOR(, , int)
> > > 
> > > /* Set IRQ eventfd for IRQ index, arg[0] = index, arg[1] = fd */
> > > #define VFIO_DEVICE_SET_IRQ_EVENTFD		_IOW(, , int)
> > > 
> > > /* Unmask IRQ index */
> > > #define VFIO_DEVICE_UNMASK_IRQ			_IOW(, , int)
> > > 
> > > /* Set unmask eventfd for index, arg[0] = index, arg[1] = fd */
> > > #define VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFD	_IOW(, , int)
> > > 
> > > 
> > > /* Return the device tree path for type/index into the user
> > >  * allocated buffer */
> > > struct dtpath {
> > > 	u32	type;	/* 0 = region, 1 = IRQ */
> > > 	u32	index;
> > > 	u32	buf_len;
> > > 	char	*buf;
> > > };
> > > #define VFIO_DEVICE_GET_DTPATH			_IOWR(, , struct dtpath)
> > > 
> > > /* Return the device tree index for type/index */
> > > struct dtindex {
> > > 	u32	type;	/* 0 = region, 1 = IRQ */
> > > 	u32	index;
> > > 	u32	prop_type;
> > > 	u32	prop_index;
> > > };
> > > #define VFIO_DEVICE_GET_DTINDEX			_IOWR(, , struct dtindex)
> > 
> > I think those need some work, but that doesn't impinge on the core
> > semantics.
> > 
> > > /* Reset the device */
> > > #define VFIO_DEVICE_RESET			_IO(, ,)
> > > 
> > > 
> > > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS	_IOW(, , int)
> > > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS	_IOW(, , int)
> > 
> > Why does this need separate controls, rather than just treating MSIs
> > as interrupts beyond the first for PCI devices?
> 
> Well, we could say that PCI will always report 3 for
> VFIO_DEVICE_GET_NUM_IRQ where 0 = legacy, 1 = MSI, 2 = MSI-X.  ioctls on
> unimplemented IRQs will fail, UNMASK* ioctls on non-level triggered
> interrupts will fail, and the parameter to SET_IRQ_EVENTFD becomes
> arg[0] = index, arg[1] = count, arg[2-n] = fd.  Maybe we'd then have a
> GET_IRQ_INFO that takes something like:
> 
> struct vfio_irq_info {
> 	int index;
> 	unsigned int count;
> 	u64 flags;
> #define VFIO_IRQ_INFO_FLAGS_LEVEL	(1 << 0)
> };
> 
> count would be 0 on PCI if the type of interrupt isn't supported.
> Better?  Thanks,
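
To make the proposed argument layout concrete, here is a minimal userspace-side sketch of packing arg[0] = index, arg[1] = count, arg[2-n] = fd, together with a check that mirrors the "UNMASK only on level-triggered interrupts" rule. The helper names, the enum and the local copy of the info struct are hypothetical and exist only for illustration; the actual ioctl call is left out because its number is not defined above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit as VFIO_IRQ_INFO_FLAGS_LEVEL in the struct quoted above. */
#define IRQ_INFO_FLAGS_LEVEL	(1 << 0)

/* Fixed PCI index convention from the message above. */
enum pci_irq_index { PCI_IRQ_LEGACY = 0, PCI_IRQ_MSI = 1, PCI_IRQ_MSIX = 2 };

/* Local stand-in for the proposed struct vfio_irq_info. */
struct irq_info {
	int index;
	unsigned int count;	/* 0 if this interrupt type isn't supported */
	uint64_t flags;
};

/* Pack index, count and eventfds into one int array.
 * Returns the number of ints written, or 0 if the buffer is too small. */
static size_t pack_irq_eventfds(int *arg, size_t arg_len, int index,
				const int *fds, unsigned int count)
{
	size_t need = 2 + (size_t)count;

	assert(arg != NULL && (fds != NULL || count == 0));
	if (arg_len < need)
		return 0;
	arg[0] = index;
	arg[1] = (int)count;
	for (unsigned int i = 0; i < count; i++)
		arg[2 + i] = fds[i];
	return need;
}

/* UNMASK* only makes sense for implemented, level-triggered interrupts. */
static int irq_can_unmask(const struct irq_info *info)
{
	assert(info != NULL);
	return info->count > 0 && (info->flags & IRQ_INFO_FLAGS_LEVEL) != 0;
}
```

A caller would query the info for each of the three indexes, skip any with count == 0, and pass the packed array to SET_IRQ_EVENTFD for the rest.
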

FYI for all, I've pushed a branch out to github with the current state
of the re-write.  You can find it here:

https://awilliam@xxxxxxxxxx/awilliam/linux-vfio.git
git://github.com/awilliam/linux-vfio.git

The vfio-ng branch is the latest.  The framework is quite a bit more
solid now, so I figure it's time to move into the device and iommu
implementation.  vfio-pci is now its own module that depends on vfio, and I
expect vfio-dt to be implemented the same way.  The PCI ioctl is in place
and supports the interface described above.  I'll continue to port
pieces of the old vfio code into this new infrastructure.  Comments and
patches welcome.  Thanks,

Alex
