On Thu, Jul 14, 2022 at 02:39:32PM +0100, Robin Murphy wrote:
> > In the meantime we do need to restore VFIO support under
> > virtio-iommu, since userspace still expects that to work, and the
> > existing use-cases are coherent devices.
>
> Yeah, I'm not necessarily against adding this as a horrible bodge for
> now - the reality is that people using VFIO must be doing it on
> coherent systems or it wouldn't be working properly anyway - as long
> as we all agree that that's what it is.
>
> Next cycle I'll be sending the follow-up patches to bring
> device_iommu_capable() to its final form (hoping the outstanding VDPA
> patch lands in the meantime), at which point we get to sort-of-fix the
> SMMU drivers[1], and can do something similar here too. I guess the
> main question for virtio-iommu is whether it needs to be
> described/negotiated in the protocol itself, or can be reliably
> described by other standard firmware properties (with maybe just a
> spec note to clarify that coherency must be consistent).

What consumers of IOMMU_CAP_CACHE_COHERENCY now want to know is whether
coherency is managed in HW for one particular endpoint, or whether they
need to issue cache maintenance. The latter cannot be handled by VFIO,
since cache maintenance is generally privileged. So let me list the
possible kinds of non-coherent access. I don't think we need a spec
change.

A. Accesses through the physical IOMMU are never coherent
----------------------------------------------------------

In this case, translated accesses from the physical device can't access
memory coherently. The host would describe this using existing FW
methods (dma-coherent in DT, _CCA in ACPI), since that is also needed
without a vIOMMU.

No change is needed for virtio-iommu, I think; it can support
non-coherent devices. It can also support mixing coherent and
non-coherent devices in the same domain, because at the moment domains
just multiplex map requests. Since we already allow sending the same map
request to two different domains, one with coherent devices and one with
non-coherent ones, we can also allow using a single domain for both. If
the host cannot handle this, it is allowed to reject attach requests for
incompatible devices.

In Linux I think capable() should include dev->dma_coherent after your
change, or the callers should check dev->dma_coherent themselves
(vfio-platform in particular).
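Concretely, assuming capable() does grow a device argument as part of
your device_iommu_capable() rework, I picture something like the sketch
below (illustration only, not a patch):

	#include <linux/device.h>
	#include <linux/iommu.h>

	static bool viommu_capable(struct device *dev, enum iommu_cap cap)
	{
		switch (cap) {
		case IOMMU_CAP_CACHE_COHERENCY:
			/*
			 * Only report coherency when DMA from this endpoint
			 * is coherent (dma-coherent in DT, _CCA in ACPI), so
			 * that VFIO refuses endpoints that would require
			 * cache maintenance.
			 */
			return dev->dma_coherent;
		default:
			return false;
		}
	}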
B. Non-cacheable mappings
-------------------------

Here accesses are normally coherent, but the pIOMMU mappings may be
configured to be non-coherent (non-cacheable access type on Arm). If
there is an actual need for this, we could add a feature bit, a probe
request property and a map flag. In Linux we may want to disallow
!IOMMU_CACHE if the device is coherent, since we don't support this
case.

C. PCIe devices performing no-snoop accesses
--------------------------------------------

Accesses are normally coherent, but the device may set a bit in the
transaction requesting it to be non-coherent.

A guest can't enable and use no-snoop in a PCIe device without knowing
whether the system supports it. It's not discoverable on Arm, so a guest
can't use it. On x86 I think it's always supported, but the pIOMMU may
enforce snoop (and the guest may be unable to perform cache maintenance?
I didn't follow the whole wbinvd discussion for lack of time). The
problem is the same without a vIOMMU, so I'd defer that to some firmware
method describing no-snoop.

D. Non-coherent virtio-iommu
----------------------------

Non-coherent virtqueues. This isn't forbidden by the spec, and a
transport driver could support it, but it's a transport problem and
virtio-iommu doesn't need to know about it.

Did I forget anything? Otherwise I don't think we need any spec change
at the moment. But when we add support for page tables we'll have to
consider each of these cases, since the guest will be able to set memory
attributes and will care about page walk coherency. That will be bundled
in a probe request along with the other page table information.
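For illustration only (none of this is in the spec, and all names and
fields below are invented for the sake of discussion), such a property
could follow the layout of the existing probe properties such as
virtio_iommu_probe_resv_mem, with the coherency cases above folded into
a flags field:

	struct virtio_iommu_probe_pgtable {
		struct virtio_iommu_probe_property	head;
		/* invented: page table format supported by the host */
		__le16					format;
		/* invented: bit 0 = page table walks are cache-coherent */
		__le16					flags;
	};

Thanks,
Jean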