On Fri, Sep 09, 2011 at 08:11:54AM -0500, Stuart Yoder wrote:
> Based on the discussions over the last couple of weeks
> I have updated the device fd file layout proposal and
> tried to specify it a bit more formally.
>
> ===============================================================
>
> 1.  Overview
>
>   This specification describes the layout of device files
>   used in the context of vfio, which gives user space
>   direct access to I/O devices that have been bound to
>   vfio.
>
>   When a device fd is opened and read, offset 0x0 contains
>   a fixed sized header followed by a number of variable length
>   records that describe different characteristics
>   of the device-- addressable regions, interrupts, etc.
>
>   0x0  +-------------+-------------+
>        |         magic             | u32  // identifies this as a vfio device file
>        +---------------------------+         and identifies the type of bus
>        |         version           | u32  // specifies the version of this
>        +---------------------------+
>        |         flags             | u32  // encodes any flags
>        +---------------------------+
>        |  dev info record 0        |
>        |    type                   | u32  // type of record
>        |    rec_len                | u32  // length in bytes of record
>        |                           |         (including record header)
>        |    flags                  | u32  // type specific flags
>        |    ...content...          |      // record content, which could
>        +---------------------------+      // include sub-records
>        |  dev info record 1        |
>        +---------------------------+
>        |  dev info record N        |
>        +---------------------------+

I really should have chimed in on this earlier, but I've been very
busy.

Um, not to put too fine a point on it, this is madness.

Yes, it's very flexible and can thereby cover a very wide range of
cases.  But it's much, much too complex.  Userspace has to parse a
complex, multilayered data structure, with variable length elements
just to get an address at which to do IO.
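
To make the parsing burden concrete, here is a rough sketch in C of
what userspace would need just to locate one record in the proposed
layout.  The field names follow the diagram quoted above; the struct
names, the find_record() helper and any record type values are made up
for illustration and are not part of the proposal or of any real vfio
API.  It assumes the whole file has already been read into a buffer
and that records are packed back to back with no padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed header at offset 0x0, per the quoted diagram. */
struct vfio_file_hdr {
	uint32_t magic;
	uint32_t version;
	uint32_t flags;
};

/* Header that starts every variable-length dev info record. */
struct vfio_rec_hdr {
	uint32_t type;
	uint32_t rec_len;	/* bytes, including this header */
	uint32_t flags;
};

/*
 * Walk the records in buf and return the byte offset of the content
 * of the first record whose type equals `want`, or -1 if there is no
 * such record or the data is malformed.
 */
static long find_record(const uint8_t *buf, size_t len, uint32_t want)
{
	size_t off = sizeof(struct vfio_file_hdr);

	if (len < off)
		return -1;

	while (len - off >= sizeof(struct vfio_rec_hdr)) {
		struct vfio_rec_hdr rh;

		/* memcpy avoids unaligned access into the raw buffer */
		memcpy(&rh, buf + off, sizeof(rh));

		/* a record must hold its own header and fit in the buffer */
		if (rh.rec_len < sizeof(rh) || rh.rec_len > len - off)
			return -1;

		if (rh.type == want)
			return (long)(off + sizeof(rh));

		off += rh.rec_len;
	}
	return -1;
}
```

And that only gets as far as the record header; the type specific
content, and any sub-records inside it, would still need parsing of
their own before an address comes out.
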

I can pretty much guarantee that if we went with this, most userspace
programs using this interface would just ignore this metadata and
directly map the offsets at which they happen to know the kernel will
put things for the type of device they care about.

_At least_ for PCI, I think the original VFIO layout of each BAR at a
fixed, well known offset is much better.  Despite its limitations,
just advertising a "device type" ID which describes one of a few fixed
layouts would be preferable to this.  I'm still hoping that we can do
a bit better than that.  But we should try really hard to at the very
least force the metadata into a simple array of resources, each with a
fixed size record describing it, even if it means some space wastage
with occasionally-used fields.  Anything more complex than that and
userspace is just never going to use it properly.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson