On Tue, Nov 01, 2022 at 06:14:58PM +0000, Robin Murphy wrote:
> On 2022-11-01 17:19, Catalin Marinas wrote:
> > The bouncing currently is all or nothing with iommu_dma_map_sg(), unlike
> > dma_direct_map_sg() which ends up calling dma_direct_map_page() and we
> > can do the bouncing per element. So I was looking to untangle
> > iommu_dma_map_sg() in a similar way but postponed it as too complicated.
> >
> > As a less than optimal solution, we can force bouncing for the whole
> > list if any of the sg elements is below the alignment size. Hopefully we
> > won't have many such mixed size cases.
>
> Sounds like you may have got the wrong impression - the main difference with
> iommu_dma_map_sg_swiotlb() is that it avoids trying to do any of the clever
> concatenation stuff, and simply maps each segment individually with
> iommu_dma_map_page(), exactly like dma-direct; only segments which need
> bouncing actually get bounced.

You are right, the iommu_dma_map_page() is called for each element if
bouncing is needed. But without scanning the sg separately,
dev_use_swiotlb() would have to be true for all non-coherent devices to
force it through that path. As you said below, this would break some
use-cases.

> What sadly wouldn't work is just adding extra conditions to
> dev_use_swiotlb() to go down the existing bounce-if-necessary path for all
> non-coherent devices, since there are non-coherent users of dma-buf and v4l2
> which (for better or worse) depend on the clever concatenation stuff
> happening.

Would such cases have a length < ARCH_DMA_MINALIGN for any of the
scatterlist elements? If not, maybe scanning the list first would work,
though we probably do need a dma_flag to avoid scanning it again for
sync and unmap.

--
Catalin
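
For illustration only, the "scan the list first" idea above could look
roughly like the sketch below. It is a userspace mock-up, not kernel code
and not a proposed patch: struct mock_sg, MOCK_DMA_FLAG_BOUNCE and all the
mock_*() helpers are hypothetical names, and the ARCH_DMA_MINALIGN value is
an assumed placeholder (the real one is architecture-specific). The point is
only that the scan happens once at map time and the result is carried in a
flag, so sync and unmap do not have to walk the list again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration; the real constant is per-architecture. */
#define ARCH_DMA_MINALIGN 128

/* Hypothetical stand-in for struct scatterlist; only the length matters. */
struct mock_sg {
	unsigned int length;
};

/* Hypothetical flag recording the outcome of the one-off scan. */
#define MOCK_DMA_FLAG_BOUNCE (1u << 0)

/* Return true if any element is shorter than the DMA alignment size. */
static bool mock_sg_needs_bounce(const struct mock_sg *sgl, size_t nents)
{
	assert(sgl != NULL || nents == 0);

	for (size_t i = 0; i < nents; i++) {
		if (sgl[i].length < ARCH_DMA_MINALIGN)
			return true;
	}
	return false;
}

/*
 * Map-time decision: scan once and return the flag. A caller would take
 * the per-element bounce path only when the flag is set, leaving the
 * concatenating path untouched for lists with no small elements.
 */
static unsigned int mock_map_sg(const struct mock_sg *sgl, size_t nents)
{
	return mock_sg_needs_bounce(sgl, nents) ? MOCK_DMA_FLAG_BOUNCE : 0;
}

/* Sync/unmap consult the saved flag instead of rescanning the list. */
static bool mock_sync_needs_bounce(unsigned int dma_flags)
{
	return (dma_flags & MOCK_DMA_FLAG_BOUNCE) != 0;
}
```

With these assumptions, a list of two 4096-byte elements would yield no
flag from mock_map_sg(), while a list containing a 64-byte element would
yield MOCK_DMA_FLAG_BOUNCE, which mock_sync_needs_bounce() then reads back
without touching the list.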