RE: [RFC 1/3] virtio-iommu: firmware description of the virtual topology

"Tian, Kevin" <kevin.tian@xxxxxxxxx> · Fri, 21 Apr 2017 08:43:43 +0000

> From: Jean-Philippe Brucker [mailto:jean-philippe.brucker@xxxxxxx]
> Sent: Wednesday, April 19, 2017 2:41 AM
> 
> On 18/04/17 10:51, Tian, Kevin wrote:
> >> From: Jean-Philippe Brucker
> >> Sent: Saturday, April 8, 2017 3:18 AM
> >>
> >> Unlike other virtio devices, the virtio-iommu doesn't work independently,
> >> it is linked to other virtual or assigned devices. So before jumping into
> >> device operations, we need to define a way for the guest to discover the
> >> virtual IOMMU and the devices it translates.
> >>
> >> The host must describe the relation between IOMMU and devices to the
> >> guest
> >> using either device-tree or ACPI. The virtual IOMMU identifies each
> >
> > Do you plan to support both device tree and ACPI?
> 
> Yes, with ACPI the topology would be described using IORT nodes. I didn't
> include an example in my driver because DT is sufficient for a prototype
> and is readily available (both in Linux and kvmtool), whereas IORT would
> be quite easy to reuse in Linux, but isn't present in kvmtool at the
> moment. However, both interfaces have to be supported for the virtio-
> iommu
> to be portable.

'portable' means whether guest enables ACPI?

> 
> >> virtual device with a 32-bit ID, that we will call "Device ID" in this
> >> document. Device IDs are not necessarily unique system-wide, but they
> may
> >> not overlap within a single virtual IOMMU. Device ID of passed-through
> >> devices do not need to match IDs seen by the physical IOMMU.
> >>
> >> The virtual IOMMU uses virtio-mmio transport exclusively, not virtio-pci,
> >> because with PCI the IOMMU interface would itself be an endpoint, and
> >> existing firmware interfaces don't allow to describe IOMMU<->master
> >> relations between PCI endpoints.
> >
> > I'm not familiar with virtio-mmio mechanism. Curious how devices in
> > virtio-mmio are enumerated today? Could we use that mechanism to
> > identify vIOMMUs and then invent a purely para-virtualized method to
> > enumerate devices behind each vIOMMU?
> 
> Using DT, virtio-mmio devices are described with "virtio-mmio" compatible
> node, and with ACPI they use _HID LNRO0005. Since the host already
> describes available devices to a guest using a firmware interface, I think
> we should reuse the tools provided by that interface for describing
> relations between DMA masters and IOMMU.

OK, I didn't realize virtio-mmio is defined to rely on DT for enumeration.

> 
> > Asking this is because each vendor has its own enumeration methods.
> > ARM has device tree and ACPI IORT. AMR has ACPI IVRS and device
> > tree (same format as ARM?). Intel has APCI DMAR and sub-tables. Your
> > current proposal looks following ARM definitions which I'm not sure
> > extensible enough to cover features defined only in other vendors'
> > structures.
> 
> ACPI IORT can be extended to incorporate para-virtualized IOMMUs,
> regardless of the underlying architecture. It isn't defined solely for the
> ARM SMMU, but serves a more general purpose of describing a map of
> device
> identifiers communicated from one components to another. Both DMAR and
> IVRS have such description (respectively DRHD and IVHD), but they are
> designed for a specific IOMMU, whereas IORT could host other kinds.

I'll take a look at IORT definition. DRHD includes information more
than device mapping.

> 
> It seems that all we really need is an interface that says "there is a
> virtio-iommu at address X, here are the devices it translates and their
> corresponding IDs", and both DT and ACPI IORT are able to fulfill this role.
> 
> > Since the purpose of this series is to go para-virtualize, why not also
> > para-virtualize and simplify the enumeration method? For example,
> > we may define a query interface through vIOMMU registers to allow
> > guest query whether a device belonging to that vIOMMU. Then we
> > can even remove use of any enumeration structure completely...
> > Just a quick example which I may not think through all the pros and
> > cons. :-)
> 
> I don't think adding a brand new topology description mechanism is worth
> the effort, we're better off reusing what already exists and is
> implemented by operating systems. Adding a query interface inside the
> vIOMMU may work (though might be very painful to integrate with fwspec in
> Linux), but would be redundant since the host has to provide a firmware
> description of the system anyway.
> 
> >> The following diagram describes a situation where two virtual IOMMUs
> >> translate traffic from devices in the system. vIOMMU 1 translates two PCI
> >> domains, in which each function has a 16-bits requester ID. In order for
> >> the vIOMMU to differentiate guest requests targeted at devices in each
> >> domain, their Device ID ranges cannot overlap. vIOMMU 2 translates two
> PCI
> >> domains and a collection of platform devices.
> >>
> >>                        Device ID    Requester ID
> >>                   /       0x0           0x0      \
> >>                  /         |             |        PCI domain 1
> >>                 /      0xffff           0xffff   /
> >>         vIOMMU 1
> >>                 \     0x10000           0x0      \
> >>                  \         |             |        PCI domain 2
> >>                   \   0x1ffff           0xffff   /
> >>
> >>                   /       0x0                    \
> >>                  /         |                      platform devices
> >>                 /      0x1fff                    /
> >>         vIOMMU 2
> >>                 \      0x2000           0x0      \
> >>                  \         |             |        PCI domain 3
> >>                   \   0x11fff           0xffff   /
> >>
> >
> > isn't above be (0x30000, 3ffff) for PCI domain 3 giving device ID is 16bit?
> 
> Unlike Requester IDs in PCI, there is no architected rule for IDs of
> platform devices, it's an integration choice. The ID of platform device is
> used exclusively for interfacing with an IOMMU (or MSI controller), it
> doesn't mean anything outside this context. Here the host allocates 13
> bits to platform device IDs, which is legal.
> 

Please add such explanation to your next version. In earlier text
"16-bits request ID" is mentioned for vIOMMU1, which gave me
the illusion that same 16bit applies to vIOMMU2 too.

Thanks
Kevin
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization