On 17/10/18 16:14, Michael S. Tsirkin wrote:
On Mon, Oct 15, 2018 at 08:46:41PM +0100, Jean-philippe Brucker wrote:
[Replying with my personal address because we're having SMTP issues]
On 15/10/2018 11:52, Michael S. Tsirkin wrote:
On Fri, Oct 12, 2018 at 02:41:59PM -0500, Bjorn Helgaas wrote:
s/iommu/IOMMU/ in subject
On Fri, Oct 12, 2018 at 03:59:13PM +0100, Jean-Philippe Brucker wrote:
Using the iommu-map binding, endpoints in a given PCI domain can be
managed by different IOMMUs. Some virtual machines may allow a subset of
endpoints to bypass the IOMMU. In some case the IOMMU itself is presented
s/case/cases/
as a PCI endpoint (e.g. AMD IOMMU and virtio-iommu). Currently, when a
PCI root complex has an iommu-map property, the driver requires all
endpoints to be described by the property. Allow the iommu-map property to
have gaps.
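For illustration, a lookup over an iommu-map that has gaps might look like the sketch below. This is not the kernel's code: it assumes each entry follows the binding's (rid-base, iommu, iommu-base, length) layout, and the names MapEntry, lookup_iommu and example_map, as well as the example values, are hypothetical. A requester ID that falls in a gap simply maps to no IOMMU.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class MapEntry(NamedTuple):
    rid_base: int    # first requester ID covered by this entry
    iommu: str       # stand-in for the IOMMU phandle (hypothetical label)
    iommu_base: int  # ID presented to the IOMMU for rid_base
    length: int      # number of consecutive requester IDs covered


def lookup_iommu(entries: Sequence[MapEntry], rid: int) -> Optional[Tuple[str, int]]:
    """Return (iommu, translated ID) for rid, or None if rid is in a gap."""
    for e in entries:
        if e.rid_base <= rid < e.rid_base + e.length:
            return e.iommu, rid - e.rid_base + e.iommu_base
    # No entry covers this requester ID: the endpoint bypasses the IOMMU.
    return None


# Hypothetical map: every requester ID except 0x8 is translated by
# "iommu0". 0x8 stands for an IOMMU that is itself a PCI endpoint.
example_map = [
    MapEntry(0x0000, "iommu0", 0x0000, 0x0008),
    MapEntry(0x0009, "iommu0", 0x0009, 0xfff7),
]
```

With this map, lookup_iommu(example_map, 0x8) returns None while the neighbouring IDs 0x7 and 0x9 are translated. Note that in such a sketch an intentional gap and a mistyped range look the same to the lookup.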
I'm not an IOMMU or virtio expert, so it's not obvious to me why it is
safe to allow devices to bypass the IOMMU. Does this mean a typo in
iommu-map could inadvertently allow devices to bypass it?
Thinking about this comment, I would like to ask: can't the
virtio device indicate the ranges in a portable way?
This would minimize the dependency on dt bindings and ACPI,
enabling support for systems that have neither but do
have virtio e.g. through pci.
I thought about adding a PROBE request for this in virtio-iommu, but it
wouldn't be usable by a Linux guest because of a bootstrapping problem.
Hmm. At some level it seems wrong to design hardware interfaces
around how Linux happens to probe things. That can change at any time
...
This isn't Linux-specific though. In general it's somewhere between
difficult and impossible to pull in an IOMMU underneath a device after
that device is active, so if any OS wants to use an IOMMU, it's going to
want to know up-front that it's there and which devices it translates so
that it can program said IOMMU appropriately *before* potentially
starting DMA and/or interrupts from the relevant devices. Linux happens
to do things in that order (either by firmware-driven probe-deferral or
just perilous initcall ordering) because it is the only reasonable order
in which to do them. AFAIK the platforms which don't rely on any
firmware description of their IOMMU tend to have a fairly static system
architecture (such that the OS simply makes hard-coded assumptions), so
it's not necessarily entirely clear how they would cope with
virtio-iommu either way.
Robin.
Early on, Linux needs a description of device dependencies, to determine
in which order to probe them. If the device dependency was described by
virtio-iommu itself, the guest could for example initialize a NIC,
allocate buffers and start DMA on the physical address space (which aborts
if the IOMMU implementation disallows DMA by default), only to find out
once the virtio-iommu module is loaded that it needs to cancel all DMA and
reconfigure the NIC. With a static description such as iommu-map in DT or
ACPI remapping tables, the guest can defer probing of the NIC until the
IOMMU is initialized.
Thanks,
Jean
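The ordering argument above can be sketched with a toy model. This is only an illustration under stated assumptions, not how the Linux driver core is implemented: probe_order and the device names are hypothetical, and the iommu_of dictionary stands in for a static description such as iommu-map or ACPI remapping tables. A device whose IOMMU has not been probed yet is deferred and retried later.

```python
from collections import deque
from typing import Dict, List, Sequence


def probe_order(devices: Sequence[str], iommu_of: Dict[str, str]) -> List[str]:
    """Return the order in which devices end up being probed.

    devices:  device names in discovery order.
    iommu_of: static description mapping a device to the IOMMU it sits
              behind; devices without an entry have no dependency.
    """
    probed = set()
    order = []
    pending = deque(devices)
    deferred_in_a_row = 0
    while pending:
        dev = pending.popleft()
        dep = iommu_of.get(dev)
        if dep is None or dep in probed:
            probed.add(dev)
            order.append(dev)
            deferred_in_a_row = 0
        else:
            # The IOMMU is not ready yet: defer and retry this device later.
            pending.append(dev)
            deferred_in_a_row += 1
            if deferred_in_a_row >= len(pending):
                # A full pass made no progress: the dependency never shows up.
                raise RuntimeError("unresolved IOMMU dependency: %s" % dev)
    return order
```

With the dependency described up front, probe_order(["nic", "virtio-iommu"], {"nic": "virtio-iommu"}) yields the IOMMU first and the NIC second. With an empty description the NIC is probed first, which is the case where it could start DMA before the IOMMU is known.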
Could you point me at the code you refer to here?