Re: [PATCH v3 0/7] Add virtio-iommu driver

Auger Eric <eric.auger@xxxxxxxxxx> · Tue, 16 Oct 2018 22:31:18 +0200

Hi Jean,

On 10/16/18 8:44 PM, Jean-Philippe Brucker wrote:
> On 16/10/2018 10:25, Auger Eric wrote:
>> Hi Jean,
>>
>> On 10/12/18 4:59 PM, Jean-Philippe Brucker wrote:
>>> Implement the virtio-iommu driver, following specification v0.8 [1].
>>> Changes since v2 [2]:
>>>
>>> * Patches 2-4 allow virtio-iommu to use the PCI transport, since QEMU
>>>    would like to phase out the MMIO transport. This produces a complex
>>>    topology where the programming interface of the IOMMU could appear
>>>    lower than the endpoints that it translates. It's not unheard of (e.g.
>>>    AMD IOMMU), and the guest easily copes with this.
>>>    
>>>    The "Firmware description" section of the specification has been
>>>    updated with all combinations of PCI, MMIO and DT, ACPI.
>>
>> I have a question wrt the FW specification. The IOMMU consumes 1 slot in
>> the PCI domain and one needs to leave a RID hole in the iommu-map.  It
>> is not obvious to me that this RID always is predictable given the pcie
>> enumeration mechanism. Generally we have a coarse grain mapping of RID
>> onto iommu phandles/STREAMIDs. Here, if I understand correctly we need
>> to precisely identify the RID granted to the iommu. On QEMU this may
>> depend on the instantiation order of the virtio-pci device right?
> 
> Yes, although it should all happen before you boot the guest, since
> there is no hotplugging an IOMMU. Could you reserve a PCI slot upfront
> and use it for virtio-iommu later? Or generate the iommu-map at the same
> time as generating the child node of the PCI RC?

Even when cold-plugging the PCIe devices through qemu CLI, this depends
on the order of the pcie devices in the list I guess. I need to further
experiment.
> 
>> So
>> this does not look trivial to build this info. Isn't it possible to do
>> this exclusion at kernel level instead?
> 
> So in theory VIRTIO_F_IOMMU_PLATFORM already does that:
> 
> VIRTIO_F_IOMMU_PLATFORM(33)
>     This feature indicates that the device is behind an IOMMU that
>     translates bus addresses from the device into physical addresses in
>     memory. If this feature bit is set to 0, then the device emits
>     physical addresses which are not translated further, even though an
>     IOMMU may be present.

This tells the driver to use the dma api, right? Effectively this
explicitly says whether the device is supposed to be upfront an IOMMU.
> 
> For better or for worse, the guest has to implement it. If this feature
> bit is unset for virtio-iommu, it does DMA on the physical address
> space, regardless of what the static topology description says.
> 
> In practice it doesn't quite work. If your iommu-map describes the IOMMU
> as translating itself, Linux' OF code will wait for the IOMMU to be
> probed before probing the IOMMU. Working around this with hacks is
> possible, but I don't want to introduce more questionable code to OF and
> device tree bindings if there is any other way.
Hum ok. I cannot really comment on this.

I just wanted to raise this concern about RID identfication.

Thanks

Eric
> 
> Thanks,
> Jean
> 
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization