> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Wednesday, April 6, 2022 12:16 AM > > PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented > by a platform as bypassing elements in the DMA coherent CPU cache > hierarchy. A driver can command a device to set this bit on some of its > transactions as a micro-optimization. > > However, the driver is now responsible to synchronize the CPU cache with > the DMA that bypassed it. On x86 this is done through the wbinvd > instruction, and the i915 GPU driver is the only Linux DMA driver that > calls it. More accurately x86 supports both unprivileged clflush instructions to invalidate one cacheline and a privileged wbinvd instruction to invalidate the entire cache. Replacing 'this is done' with 'this may be done' is clearer. > > The problem comes that KVM on x86 will normally disable the wbinvd > instruction in the guest and render it a NOP. As the driver running in the > guest is not aware the wbinvd doesn't work it may still cause the device > to set the no-snoop bit and the platform will bypass the CPU cache. > Without a working wbinvd there is no way to re-synchronize the CPU cache > and the driver in the VM has data corruption. > > Thus, we see a general direction on x86 that the IOMMU HW is able to block > the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to > to NOP the wbinvd without causing any data corruption. > > This control for Intel IOMMU was exposed by using IOMMU_CACHE and > IOMMU_CAP_CACHE_COHERENCY, however these two values now have > multiple > meanings and usages beyond blocking no-snoop and the whole thing has > become confused. Also point out your finding about AMD IOMMU? > > Change it so that: > - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a > device. It is used by the DMA API and set when the DMA API will not be > doing manual cache coherency operations. > > - dev_is_dma_coherent() indicates if IOMMU_CACHE can be used with the > device > > - The new optional domain op enforce_cache_coherency() will cause the > entire domain to block no-snoop requests - ie there is no way for any > device attached to the domain to opt out of the IOMMU_CACHE behavior. > > An iommu driver should implement enforce_cache_coherency() so that by > default domains allow the no-snoop optimization. This leaves it available > to kernel drivers like i915. VFIO will call enforce_cache_coherency() > before establishing any mappings and the domain should then permanently > block no-snoop. > > If enforce_cache_coherency() fails VFIO will communicate back through to > KVM into the arch code via kvm_arch_register_noncoherent_dma() > (only implemented by x86) which triggers a working wbinvd to be made > available to the VM. > > While other arches are certainly welcome to implement > enforce_cache_coherency(), it is not clear there is any benefit in doing > so. > > After this series there are only two calls left to iommu_capable() with a > bus argument which should help Robin's work here. > > This is on github: > https://github.com/jgunthorpe/linux/commits/intel_no_snoop > > Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx> > Cc: Robin Murphy <robin.murphy@xxxxxxx> > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx> > Cc: Christoph Hellwig <hch@xxxxxx> > Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > > Jason Gunthorpe (5): > iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with > dev_is_dma_coherent() > vfio: Require that devices support DMA cache coherence > iommu: Introduce the domain op enforce_cache_coherency() > vfio: Move the Intel no-snoop control off of IOMMU_CACHE > iommu: Delete IOMMU_CAP_CACHE_COHERENCY > > drivers/infiniband/hw/usnic/usnic_uiom.c | 16 +++++------ > drivers/iommu/amd/iommu.c | 9 +++++-- > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 -- > drivers/iommu/arm/arm-smmu/arm-smmu.c | 6 ----- > drivers/iommu/arm/arm-smmu/qcom_iommu.c | 6 ----- > drivers/iommu/fsl_pamu_domain.c | 6 ----- > drivers/iommu/intel/iommu.c | 15 ++++++++--- > drivers/iommu/s390-iommu.c | 2 -- > drivers/vfio/vfio.c | 6 +++++ > drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++-------- > drivers/vhost/vdpa.c | 3 ++- > include/linux/intel-iommu.h | 1 + > include/linux/iommu.h | 6 +++-- > 13 files changed, 58 insertions(+), 50 deletions(-) > > > base-commit: 3123109284176b1532874591f7c81f3837bbdc17 > -- > 2.35.1 _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization