On 21/04/17 09:43, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@xxxxxxx]
>> Sent: Wednesday, April 19, 2017 2:41 AM
>>
>> On 18/04/17 10:51, Tian, Kevin wrote:
>>>> From: Jean-Philippe Brucker
>>>> Sent: Saturday, April 8, 2017 3:18 AM
>>>>
>>>> Unlike other virtio devices, the virtio-iommu doesn't work independently,
>>>> it is linked to other virtual or assigned devices. So before jumping into
>>>> device operations, we need to define a way for the guest to discover the
>>>> virtual IOMMU and the devices it translates.
>>>>
>>>> The host must describe the relation between IOMMU and devices to the
>>>> guest using either device-tree or ACPI. The virtual IOMMU identifies each
>>>
>>> Do you plan to support both device tree and ACPI?
>>
>> Yes, with ACPI the topology would be described using IORT nodes. I didn't
>> include an example in my driver because DT is sufficient for a prototype
>> and is readily available (both in Linux and kvmtool), whereas IORT would
>> be quite easy to reuse in Linux, but isn't present in kvmtool at the
>> moment. However, both interfaces have to be supported for the
>> virtio-iommu to be portable.
>
> 'portable' means whether guest enables ACPI?

Sorry, "supported" isn't the right term for what I meant. It is for the
firmware interface to accommodate devices, not the other way around, so
firmware considerations are outside the scope of the virtio-iommu
specification and virtio-iommu itself doesn't need to "support" any
interface.

For the purpose of this particular document however, both popular firmware
interfaces (ACPI and DT) must be taken into account. Those are the two
interfaces I know about; there might be others. But I figure that a VMM
implementing a virtual IOMMU is complex enough to be able to also implement
one of these two interfaces, so talking about DT and ACPI should fit all
use cases. It also provides two examples for other firmware interfaces that
wish to describe the IOMMU topology.

>>>> virtual device with a 32-bit ID, that we will call "Device ID" in this
>>>> document. Device IDs are not necessarily unique system-wide, but they may
>>>> not overlap within a single virtual IOMMU. Device ID of passed-through
>>>> devices do not need to match IDs seen by the physical IOMMU.
>>>>
>>>> The virtual IOMMU uses virtio-mmio transport exclusively, not virtio-pci,
>>>> because with PCI the IOMMU interface would itself be an endpoint, and
>>>> existing firmware interfaces don't allow to describe IOMMU<->master
>>>> relations between PCI endpoints.
>>>
>>> I'm not familiar with virtio-mmio mechanism. Curious how devices in
>>> virtio-mmio are enumerated today? Could we use that mechanism to
>>> identify vIOMMUs and then invent a purely para-virtualized method to
>>> enumerate devices behind each vIOMMU?
>>
>> Using DT, virtio-mmio devices are described with "virtio-mmio" compatible
>> node, and with ACPI they use _HID LNRO0005. Since the host already
>> describes available devices to a guest using a firmware interface, I think
>> we should reuse the tools provided by that interface for describing
>> relations between DMA masters and IOMMU.
>
> OK, I didn't realize virtio-mmio is defined to rely on DT for enumeration.

Not necessarily DT, you can have virtio-mmio devices in the ACPI namespace
as well. Qemu has an example of LNRO0005 with ACPI.

>>> Asking this is because each vendor has its own enumeration methods.
>>> ARM has device tree and ACPI IORT. AMD has ACPI IVRS and device
>>> tree (same format as ARM?). Intel has ACPI DMAR and sub-tables. Your
>>> current proposal looks following ARM definitions which I'm not sure
>>> extensible enough to cover features defined only in other vendors'
>>> structures.
>>
>> ACPI IORT can be extended to incorporate para-virtualized IOMMUs,
>> regardless of the underlying architecture.
>> It isn't defined solely for the ARM SMMU, but serves a more general
>> purpose of describing a map of device identifiers communicated from one
>> component to another. Both DMAR and IVRS have such description
>> (respectively DRHD and IVHD), but they are designed for a specific IOMMU,
>> whereas IORT could host other kinds.
>
> I'll take a look at IORT definition. DRHD includes information more
> than device mapping.

I guess that most information provided by DMAR and others is
IOMMU-specific and the equivalent for virtio-iommu would fit in virtio
config space. But describing device mapping relative to IOMMUs is the same
problem for all systems. Doing it with a virtio-iommu probing mechanism
would require reinventing a way to identify devices every time a host
wants to add support for a new bus (RID for PCI, base address for MMIO,
others in the future), when firmware would have to provide this
information anyway for bare metal.

>> It seems that all we really need is an interface that says "there is a
>> virtio-iommu at address X, here are the devices it translates and their
>> corresponding IDs", and both DT and ACPI IORT are able to fulfill this role.
>>
>>> Since the purpose of this series is to go para-virtualize, why not also
>>> para-virtualize and simplify the enumeration method? For example,
>>> we may define a query interface through vIOMMU registers to allow
>>> guest query whether a device belonging to that vIOMMU. Then we
>>> can even remove use of any enumeration structure completely...
>>> Just a quick example which I may not think through all the pros and
>>> cons. :-)
>>
>> I don't think adding a brand new topology description mechanism is worth
>> the effort, we're better off reusing what already exists and is
>> implemented by operating systems.
>> Adding a query interface inside the vIOMMU may work (though might be very
>> painful to integrate with fwspec in Linux), but would be redundant since
>> the host has to provide a firmware description of the system anyway.
>>
>>>> The following diagram describes a situation where two virtual IOMMUs
>>>> translate traffic from devices in the system. vIOMMU 1 translates two PCI
>>>> domains, in which each function has a 16-bits requester ID. In order for
>>>> the vIOMMU to differentiate guest requests targeted at devices in each
>>>> domain, their Device ID ranges cannot overlap. vIOMMU 2 translates two PCI
>>>> domains and a collection of platform devices.
>>>>
>>>>                Device ID    Requester ID
>>>>           /       0x0             0x0      \
>>>>          /         |               |        PCI domain 1
>>>>         /      0xffff          0xffff      /
>>>>     vIOMMU 1
>>>>         \     0x10000             0x0      \
>>>>          \         |               |        PCI domain 2
>>>>           \   0x1ffff          0xffff      /
>>>>
>>>>           /       0x0                      \
>>>>          /         |                        platform devices
>>>>         /      0x1fff                      /
>>>>     vIOMMU 2
>>>>         \      0x2000             0x0      \
>>>>          \         |               |        PCI domain 3
>>>>           \   0x11fff          0xffff      /
>>>>
>>>
>>> isn't above be (0x30000, 3ffff) for PCI domain 3 giving device ID is 16bit?
>>
>> Unlike Requester IDs in PCI, there is no architected rule for IDs of
>> platform devices, it's an integration choice. The ID of platform device is
>> used exclusively for interfacing with an IOMMU (or MSI controller), it
>> doesn't mean anything outside this context. Here the host allocates 13
>> bits to platform device IDs, which is legal.
>>
>
> Please add such explanation to your next version. In earlier text
> "16-bits request ID" is mentioned for vIOMMU1, which gave me
> the illusion that same 16bit applies to vIOMMU2 too.

Sure, I will clarify this.

Thanks,
Jean-Philippe
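
For reference, the Device ID layout in the diagram above can be sketched in
a few lines of Python. This is only an illustration of the numbering, not
part of the virtio-iommu proposal: the IdRange type and the to_device_id()
and overlaps() helpers are hypothetical names, and the ranges are simply
copied from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdRange:
    """One contiguous block of Device IDs behind a vIOMMU (hypothetical type)."""
    name: str            # label taken from the diagram
    device_base: int     # first Device ID seen by the vIOMMU
    length: int          # number of IDs in the block
    input_base: int = 0  # first bus-local ID (Requester ID for PCI)

    def to_device_id(self, input_id: int) -> int:
        """Translate a bus-local ID into the Device ID used by the vIOMMU."""
        offset = input_id - self.input_base
        if not 0 <= offset < self.length:
            raise ValueError(f"{input_id:#x} is outside {self.name}")
        return self.device_base + offset


def overlaps(ranges) -> bool:
    """Return True if any two ranges of one vIOMMU share a Device ID."""
    ordered = sorted(ranges, key=lambda r: r.device_base)
    # After sorting by base, any overlap shows up between neighbours.
    return any(a.device_base + a.length > b.device_base
               for a, b in zip(ordered, ordered[1:]))


# Layout assumed from the diagram: 16-bit Requester IDs per PCI domain,
# 13 bits of IDs for the platform devices.
VIOMMU1 = [
    IdRange("PCI domain 1", device_base=0x0, length=0x10000),
    IdRange("PCI domain 2", device_base=0x10000, length=0x10000),
]
VIOMMU2 = [
    IdRange("platform devices", device_base=0x0, length=0x2000),
    IdRange("PCI domain 3", device_base=0x2000, length=0x10000),
]

if __name__ == "__main__":
    # Requester ID 0xffff in PCI domain 3 becomes Device ID 0x11fff.
    print(hex(VIOMMU2[1].to_device_id(0xffff)))
```

Both vIOMMUs use Device ID 0x0, which is fine because Device IDs only need
to be unique within a single virtual IOMMU. PCI domain 3 starts at 0x2000
rather than 0x30000 because the platform devices in front of it only take
13 bits of ID space.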