Re: RFC: vfio / device assignment -- layout of device fd files

On 08/29/2011 05:46 PM, Alex Williamson wrote:
> On Mon, 2011-08-29 at 16:58 -0500, Scott Wood wrote:
>> On 08/29/2011 02:51 PM, Alex Williamson wrote:
>>> On Mon, 2011-08-29 at 16:51 +0000, Yoder Stuart-B08248 wrote:
>>>> The device info records following the file header have the following
>>>> record types each with content encoded in a record specific way:
>>>>
>>>>  REGION  - describes an addressable address range for the device
>>>>  DTPATH - describes the device tree path for the device
>>>>  DTINDEX - describes the index into the related device tree
>>>>            property (reg,ranges,interrupts,interrupt-map)
>>>
>>> I don't quite understand if these are physical or virtual.
>>
>> If what are physical or virtual?
> 
> Can you give an example of a path vs an index?  I don't understand
> enough about these to ask a useful question about what they're
> describing.

You'd have both path and index.

For example, given this tree:

/ {
	...
	foo {
		...
		bar {
			reg = <0x1000 64 0x1800 64>;
			ranges = <0 0x20000 0x10000>;
			...

			child {
				reg = <0x100 0x100>;
				...
			};
		};
	};
};

There would be 4 regions if you bind to /foo/bar:

// this is 64 bytes at 0x1000
DTPATH "/foo/bar"
DTINDEX prop_type=REG prop_index=0

// this is 64 bytes at 0x1800
DTPATH "/foo/bar"
DTINDEX prop_type=REG prop_index=1

// this is 64K at 0x20000
DTPATH "/foo/bar"
DTINDEX prop_type=RANGES prop_index=0

// this is 256 bytes at 0x20100
DTPATH "/foo/bar/child"
DTINDEX prop_type=REG prop_index=0

Both ranges and the child reg are needed, since ranges could be a simple
"ranges;" that passes everything with no translation, and child nodes
could be absent-but-implied in some other cases (such as when they
represent PCI devices which can be probed -- we still need to map the
ranges that correspond to PCI controller windows).
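The 0x20100 address of the child region above comes from translating the child's reg through the parent's ranges entry. A minimal sketch of that translation follows, assuming one address cell and one size cell on both buses as in the example tree. The names struct dt_range and dt_translate are hypothetical and are not taken from any existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One entry of a "ranges" property, assuming one address cell and one
 * size cell on both the child and the parent bus (hypothetical type). */
struct dt_range {
	uint64_t child_base;   /* start address on the child bus */
	uint64_t parent_base;  /* matching start address on the parent bus */
	uint64_t size;         /* length of the window in bytes */
};

/* Translate a child-bus region into the parent's address space.
 * Returns false if [addr, addr + len) does not fit inside the window. */
static bool dt_translate(const struct dt_range *r, uint64_t addr,
			 uint64_t len, uint64_t *out)
{
	assert(r && out);

	if (addr < r->child_base)
		return false;

	uint64_t off = addr - r->child_base;

	/* Written this way to avoid overflow in addr + len. */
	if (off > r->size || len > r->size - off)
		return false;

	*out = r->parent_base + off;
	return true;
}
```

With ranges = <0 0x20000 0x10000> and the child's reg = <0x100 0x100>, dt_translate() yields 0x20100, matching the fourth region listed above.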

>>>>  INTERRUPT - describes an interrupt for the device
>>>>  PCI_CONFIG_SPACE - describes config space for the device
>>>
>>> I would have expected this to be a REGION with a property of
>>> PCI_CONFIG_SPACE.
>>
>> Could be, if physical address is made optional.
> 
> Or physical address is also a property, aka sub-region.

A subrecord of REGION is fine with me.
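To illustrate what a subrecord of REGION might look like, here is a rough sketch in which every record and subrecord starts with a type/length header and a REGION's payload is simply a sequence of subrecords. The header layout, the numeric type values and the find_subrecord name are all assumptions made for illustration. The thread has not settled on an encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subrecord types; the names follow the thread, the
 * numeric values are invented for this sketch. */
enum rec_type {
	REC_REGION = 1,
	REC_PHYS_ADDR = 2,
	REC_PCI_CONFIG_SPACE = 3,
};

/* Hypothetical header shared by records and subrecords.
 * len counts the header itself plus the payload that follows it. */
struct rec_hdr {
	uint32_t type;
	uint32_t len;
};

/* Scan the subrecords stored in [buf, buf + size) and return the byte
 * offset of the first one with the given type, or -1 if there is none
 * or the data is malformed. */
static long find_subrecord(const uint8_t *buf, size_t size, uint32_t type)
{
	size_t off = 0;

	assert(buf);
	while (size - off >= sizeof(struct rec_hdr)) {
		struct rec_hdr h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > size - off)
			return -1;	/* length field is inconsistent */
		if (h.type == type)
			return (long)off;
		off += h.len;
	}
	return -1;
}
```

In this arrangement a PCI config space region would be a REGION carrying a PCI_CONFIG_SPACE subrecord and no PHYS_ADDR subrecord, which makes the physical address optional without a special case.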

>>> Would we only need to expose phys addr for 1:1 mapping requirements?
>>> I'm not sure why we'd care to expose this otherwise.
>>
>> It's more important for non-PCI, where it avoids the need for userspace
>> to parse the device tree to find the guest address (we'll usually want
>> 1:1), or to consolidate pages shared by multiple regions.  It could be
>> nice for debugging, as well.
> 
> So the device tree path is ripped straight from the system, so it's the
> actual 1:1, matching physical hardware, path.

Yes.

>>> Even for non-PCI we need to
>>> know if the region is pio/mmio32/mmio64/prefetchable/etc.
>>
>> Outside of PCI, what standardized form would you put such information
>> in?  Where would the kernel get this information?  What does
>> mmio32/mmio64 mean in this context?
> 
> I could imagine a platform device described by ACPI that might want to
> differentiate.  The physical device doesn't get moved of course, but
> guest drivers might care how the device is described if we need to
> rebuild those ACPI tables.  ACPI might even be a good place to leverage
> these data structures... /me ducks.

ACPI info could be another subrecord type, but in the device tree
system-bus case we generally don't have this information at the generic
infrastructure level.  Drivers are expected to know how their devices'
regions should be mapped.

>>> BAR index could really just translate to a REGION instance number.
>>
>> How would that work if you make non-BAR things (such as config space)
>> into regions?
> 
> Put their instance numbers outside of the BAR region?  We have a fixed
> REGION space on PCI, so we could just define BAR0 == instance 0, BAR1 ==
> instance 1... ROM == instance 6, CONFIG == instance 0xF (or 7).

Seems more awkward than just having each region say what it is.  What do
you do to fill in the gaps?
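As a sketch of the alternative, each region can carry its own kind, so a device with only BAR0, BAR2 and config space needs three entries and no placeholders. The type and function names below are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region kinds for a PCI device. */
enum region_kind { REGION_BAR, REGION_ROM, REGION_CONFIG };

/* Each region describes itself instead of relying on its position. */
struct region_desc {
	enum region_kind kind;
	unsigned int bar;	/* meaningful only when kind == REGION_BAR */
};

/* Return the index of the matching region in tbl, or -1 if the device
 * does not have one. The bar argument is ignored for non-BAR kinds. */
static int find_region(const struct region_desc *tbl, size_t n,
		       enum region_kind kind, unsigned int bar)
{
	assert(tbl || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].kind != kind)
			continue;
		if (kind == REGION_BAR && tbl[i].bar != bar)
			continue;
		return (int)i;
	}
	return -1;
}
```

Userspace then looks a region up by what it is, and a missing BAR shows up as a failed lookup instead of an empty slot in a fixed numbering.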

-Scott
