On Tue, 5 Apr 2022 13:16:02 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
>
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit that, when set, causes
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as
> though the no-snoop bit was clear.
>
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain, which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW
> capability.
>
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow-up patch.
>
> Only VFIO will call this new API.
>
> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
> ---
>  drivers/iommu/amd/iommu.c   |  7 +++++++
>  drivers/iommu/intel/iommu.c | 14 +++++++++++++-
>  include/linux/intel-iommu.h |  1 +
>  include/linux/iommu.h       |  4 ++++
>  4 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device *dev)
>  	return 0;
>  }
>
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	/* IOMMU_PTE_FC is always set */
> +	return true;
> +}
> +
>  const struct iommu_ops amd_iommu_ops = {
>  	.capable = amd_iommu_capable,
>  	.domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
>  		.flush_iotlb_all = amd_iommu_flush_iotlb_all,
>  		.iotlb_sync = amd_iommu_iotlb_sync,
>  		.free = amd_iommu_domain_free,
> +		.enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
>  	}
>  };
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
>  		prot |= DMA_PTE_READ;
>  	if (iommu_prot & IOMMU_WRITE)
>  		prot |= DMA_PTE_WRITE;
> -	if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> +	if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> +	    dmar_domain->enforce_no_snoop)
>  		prot |= DMA_PTE_SNP;
>
>  	max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
>  	return phys;
>  }
>
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> +	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> +	if (!dmar_domain->iommu_snooping)
> +		return false;
> +	dmar_domain->enforce_no_snoop = true;
> +	return true;
> +}

Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, i.e., a reserved bit set in the PTE triggering faults?  It
seems really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping.  I'd get false on the
subsequent enforcement test, but the dmar_domain would still be trying
to use DMA_PTE_SNP.
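
Concretely, the sequence I'm worried about would be something like the
following (a hypothetical sketch, not code from the patch; grp_a/grp_b,
iova and paddr are placeholders, and error handling is omitted):

	struct iommu_domain *dom = iommu_domain_alloc(&pci_bus_type);

	/* Every unit behind grp_a supports snoop control. */
	iommu_attach_group(dom, grp_a);

	/* iommu_snooping is 1, so this returns true and latches
	 * enforce_no_snoop = 1 on the dmar_domain. */
	dom->ops->enforce_cache_coherency(dom);

	/* grp_b brings in a DMAR without ecap SC; iommu_snooping is
	 * recomputed to 0, but enforce_no_snoop stays 1. */
	iommu_attach_group(dom, grp_b);

	/* intel_iommu_map() still sets DMA_PTE_SNP, which the less
	 * capable unit treats as a reserved bit -> PTE fault. */
	iommu_map(dom, iova, paddr, SZ_4K, IOMMU_READ | IOMMU_WRITE);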
There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency() for a domain, it seems like the
domain should retain those semantics regardless of how it's modified,
i.e., "enforced".  For example, if I tried to perform the above
operation, I should get a failure attaching the device that brings in
the less capable DMAR, because the domain has been set to enforce this
feature.  If the API is instead that I need to call
enforce_cache_coherency() again on every modification of the domain,
shouldn't dmar_domain->enforce_no_snoop also return to a default value
on domain changes?  Maybe this should be something like
set_no_snoop_squashing with the above semantics, where it needs to be
re-applied whenever the domain:device composition changes?  (A rough
sketch of the attach-time check I'd expect is below the quoted
remainder.)  Thanks,

Alex

> +
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>  	if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
>  		.iotlb_sync = intel_iommu_tlb_sync,
>  		.iova_to_phys = intel_iommu_iova_to_phys,
>  		.free = intel_iommu_domain_free,
> +		.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
>  	}
>  };
>
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 100644
> --- a/include/linux/intel-iommu.h
> +++ b/include/linux/intel-iommu.h
> @@ -540,6 +540,7 @@ struct dmar_domain {
>  	u8 has_iotlb_device: 1;
>  	u8 iommu_coherency: 1;		/* indicate coherency of iommu access */
>  	u8 iommu_snooping: 1;		/* indicate snooping control feature */
> +	u8 enforce_no_snoop : 1;	/* Create IOPTEs with snoop control */
>
>  	struct list_head devices;	/* all devices' list */
>  	struct iova_domain iovad;	/* iova's that belong to this domain */
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 9208eca4b0d1ac..fe4f24c469c373 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -272,6 +272,9 @@ struct iommu_ops {
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *              queue
>   * @iova_to_phys: translate iova to physical address
> + * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> + *                           including no-snoop TLPs on PCIe or other platform
> + *                           specific mechanisms.
>   * @enable_nesting: Enable nesting
>   * @set_pgtable_quirks: Set io page table quirks (IO_PGTABLE_QUIRK_*)
>   * @free: Release the domain after use.
> @@ -300,6 +303,7 @@ struct iommu_domain_ops {
>  	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
>  				    dma_addr_t iova);
>
> +	bool (*enforce_cache_coherency)(struct iommu_domain *domain);
>  	int (*enable_nesting)(struct iommu_domain *domain);
>  	int (*set_pgtable_quirks)(struct iommu_domain *domain,
>  				  unsigned long quirks);
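
For what it's worth, the attach-time rejection I have in mind would be
roughly the following (a hypothetical sketch only, not part of this
patch; it piggybacks on prepare_domain_attach_device(),
device_to_iommu() and ecap_sc_support() in the Intel driver, and
assumes the enforce_no_snoop flag added above):

	/*
	 * Sketch: once a domain has promised enforcement, refuse to
	 * attach a device whose DMAR lacks snoop control, so hardware
	 * that treats DMA_PTE_SNP as reserved never walks those IOPTEs.
	 */
	static int prepare_domain_attach_device(struct iommu_domain *domain,
						struct device *dev)
	{
		struct dmar_domain *dmar_domain = to_dmar_domain(domain);
		struct intel_iommu *iommu;
		u8 bus, devfn;

		iommu = device_to_iommu(dev, &bus, &devfn);
		if (!iommu)
			return -ENODEV;

		/* the domain already guaranteed no-snoop blocking */
		if (dmar_domain->enforce_no_snoop &&
		    !ecap_sc_support(iommu->ecap))
			return -EINVAL;

		/* ... existing address width and superpage checks ... */
		return 0;
	}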