+Robin.

> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Thursday, September 23, 2021 8:22 PM
>
> On Thu, Sep 23, 2021 at 12:05:29PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Thursday, September 23, 2021 7:27 PM
> > >
> > > On Thu, Sep 23, 2021 at 11:15:24AM +0100, Jean-Philippe Brucker wrote:
> > >
> > > > So we can only tell userspace "No_snoop is not supported" (provided we
> > > > even want to allow them to enable No_snoop). Users in control of stage-1
> > > > tables can create non-cacheable mappings through MAIR attributes.
> > >
> > > My point is that ARM is using IOMMU_CACHE to control the overall
> > > cachability of the DMA
> > >
> > > ie not specifying IOMMU_CACHE requires using the arch specific DMA
> > > cache flushers.
> > >
> > > Intel never uses arch specifc DMA cache flushers, and instead is
> > > abusing IOMMU_CACHE to mean IOMMU_BLOCK_NO_SNOOP on DMA that
> > > is always cachable.
> >
> > it uses IOMMU_CACHE to force all DMAs to snoop, including those which
> > has non_snoop flag and wouldn't snoop cache if iommu is disabled.
> > Nothing is blocked.
>
> I see it differently, on Intel the only way to bypass the cache with
> DMA is to specify the no-snoop bit in the TLP. The IOMMU PTE flag we
> are talking about tells the IOMMU to ignore the no snoop bit.
>
> Again, Intel arch in the kernel does not support the DMA cache flush
> arch API and *DOES NOT* support incoherent DMA at all.
>
> ARM *does* implement the DMA cache flush arch API and is using
> IOMMU_CACHE to control if the caller will, or will not call the cache
> flushes.

I still don't fully understand this point after reading the code. In
dma-iommu, the cache flush functions all start with the following check:

	if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
		return;

dev->dma_coherent is initialized from firmware information, not decided
by IOMMU_CACHE, i.e. it is not IOMMU_CACHE that decides whether the
cache flushes are called.

The confusion probably comes from __iommu_dma_alloc_noncontiguous:

	if (!(ioprot & IOMMU_CACHE)) {
		struct scatterlist *sg;
		int i;

		for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
			arch_dma_prep_coherent(sg_page(sg), sg->length);
	}

Here it would make more sense for the check to be if (!coherent) {}.

With the above corrected, I think all IOMMU drivers do associate
IOMMU_CACHE with the snoop aspect:

Intel:
  - either force snooping by ignoring the snoop bit in the TLP (IOMMU_CACHE)
  - or let the TLP decide snooping (!IOMMU_CACHE)

ARM:
  - set to snoop format if IOMMU_CACHE
  - set to nonsnoop format if !IOMMU_CACHE
  (in both cases the TLP snoop bit is ignored?)

Other archs:
  - ignore IOMMU_CACHE, as the cache is always snooped via their IOMMUs

>
> This is fundamentally different from what Intel is using it for.
>
> > but why do you call it abuse? IOMMU_CACHE was first introduced for
> > Intel platform:
>
> IMHO ARM changed the meaning when Robin linked IOMMU_CACHE to
> dma_is_coherent stuff. At that point it became linked to 'do I need to
> call arch cache flushers or not'.
>

I couldn't identify the exact commit that made the above change in
meaning.

Robin, could you help share some thoughts here?

Thanks
Kevin