Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files

On Fri, Sep 09, 2011 at 08:11:54AM -0500, Stuart Yoder wrote:
> Based on the discussions over the last couple of weeks
> I have updated the device fd file layout proposal and
> tried to specify it a bit more formally.
> 
> ===============================================================
> 
> 1.  Overview
> 
>   This specification describes the layout of device files
>   used in the context of vfio, which gives user space
>   direct access to I/O devices that have been bound to
>   vfio.
> 
>   When a device fd is opened and read, offset 0x0 contains
>   a fixed sized header followed by a number of variable length
>   records that describe different characteristics
>   of the device-- addressable regions, interrupts, etc.
> 
>   0x0  +-------------+-------------+
>        |         magic             | u32  // identifies this as a vfio device
>        +---------------------------+      // file and identifies the type of bus
>        |         version           | u32  // specifies the version of this
>        +---------------------------+
>        |         flags             | u32  // encodes any flags
>        +---------------------------+
>        |  dev info record 0        |
>        |    type                   | u32   // type of record
>        |    rec_len                | u32   // length in bytes of record
>        |                           |          (including record header)
>        |    flags                  | u32   // type specific flags
>        |    ...content...          |       // record content, which could
>        +---------------------------+       // include sub-records
>        |  dev info record 1        |
>        +---------------------------+
>        |  dev info record N        |
>        +---------------------------+

I really should have chimed in on this earlier, but I've been very
busy.

Um, not to put too fine a point on it, this is madness.

Yes, it's very flexible and can thereby cover a very wide range of
cases.  But it's much, much too complex.  Userspace has to parse a
complex, multilayered data structure, with variable length elements
just to get an address at which to do IO.  I can pretty much guarantee
that if we went with this, most userspace programs using this
interface would just ignore this metadata and directly map the
offsets at which they happen to know the kernel will put things for
the type of device they care about.
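
To make the parsing burden concrete, here is a minimal sketch of what
userspace would need just to locate one record in the proposed layout.
It assumes the field order from the quoted diagram, host byte order, and
that the caller already knows how many bytes it read; the names
find_record, DEV_HDR_LEN and REC_HDR_LEN are hypothetical and not part
of the proposal.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; sizes follow the quoted diagram. */
#define DEV_HDR_LEN 12u   /* magic, version, flags: three u32 */
#define REC_HDR_LEN 12u   /* type, rec_len, flags: three u32 */

static uint32_t read_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* assumption: host byte order */
    return v;
}

/*
 * Find the first record of the given type.
 * Returns its byte offset in buf, or -1 if it is absent or the
 * buffer is malformed.
 */
static long find_record(const uint8_t *buf, size_t len, uint32_t want_type)
{
    size_t off = DEV_HDR_LEN;

    if (len < DEV_HDR_LEN)
        return -1;
    while (len - off >= REC_HDR_LEN) {
        uint32_t type = read_u32(buf + off);
        uint32_t rec_len = read_u32(buf + off + 4);

        /* rec_len includes the record header and must fit in buf */
        if (rec_len < REC_HDR_LEN || rec_len > len - off)
            return -1;
        if (type == want_type)
            return (long)off;
        off += rec_len;
    }
    return -1;
}
```

This only reaches the top-level record; any sub-records inside the
content would need a second walk of the same kind.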

_At least_ for PCI, I think the original VFIO layout of each BAR at a
fixed, well known offset is much better.  Despite its limitations,
just advertising a "device type" ID which describes one of a few fixed
layouts would be preferable to this.  I'm still hoping that we can do
a bit better than that.  But we should try really hard to at the very
least force the metadata into a simple array of resources each with a
fixed size record describing it, even if it means some space wastage
with occasionally-used fields.  Anything more complex than that and
userspace is just never going to use it properly.
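
As a rough illustration of the simple-array alternative, the sketch
below shows one possible fixed-size descriptor.  Every field name and
width here is an assumption made for the example, not something taken
from VFIO or from the proposal; the point is only that resource i
becomes table[i], with no length fields to follow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size descriptor; all fields are assumptions. */
struct dev_resource {
    uint32_t type;     /* kind of resource, e.g. region or interrupt */
    uint32_t flags;    /* occasionally-used bits; zero when unused */
    uint64_t offset;   /* offset of the resource within the device fd */
    uint64_t size;     /* length in bytes; may be unused for some types */
};

/* Look up resource `index`; returns NULL if it is out of range. */
static const struct dev_resource *
get_resource(const struct dev_resource *table, size_t count, size_t index)
{
    return index < count ? &table[index] : NULL;
}
```

Some fields go unused for some resource types, which is the space
wastage mentioned above, but lookup is a bounds check and an index.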

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson