Hi Robin,

On 20.08.2020 17:08, Robin Murphy wrote:
> With the IOMMU ops now looking much the same shape as iommu_dma_ops,
> switch them out in favour of the iommu-dma library, currently enhanced
> with temporary workarounds that allow it to also sit underneath the
> arch-specific API. With that in place, we can now start converting the
> remaining IOMMU drivers and consumers to work with IOMMU API default
> domains instead.
>
> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>

I've played with this a bit longer and found that reading buffers
allocated via dma_alloc_attrs() from the dma-iommu ops through their
kernel virtual address returns garbage from time to time. It took me a
while to debug this...

Your conversion misses adding arch_dma_prep_coherent() to arch/arm, so
the buffers are cleared by the mm allocator, but the caches are NOT
flushed for the newly allocated buffers.

This fixes the issue:

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fec3e59215b8..8b60bcc5b14f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2,6 +2,7 @@
 config ARM
 	bool
 	default y
+	select ARCH_HAS_DMA_PREP_COHERENT
 	select ARCH_32BIT_OFF_T
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ff6c4962161a..6954681b73da 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -266,6 +266,20 @@ static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag
 	}
 }
 
+void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+
+	if (PageHighMem(page)) {
+		phys_addr_t base = __pfn_to_phys(page_to_pfn(page));
+		phys_addr_t end = base + size;
+		outer_flush_range(base, end);
+	} else {
+		void *ptr = page_address(page);
+		dmac_flush_range(ptr, ptr + size);
+		outer_flush_range(__pa(ptr), __pa(ptr) + size);
+	}
+}
+
 /*
  * Allocate a DMA buffer for 'dev' of size 'size' using the
  * specified gfp mask. Note that 'size' must be page aligned.

I also wonder if it would be better to use per-arch
__dma_clear_buffer() instead of setting __GFP_ZERO unconditionally in
dma-iommu.c. This should be faster on ARM with highmem...

> ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland