On Fri, Sep 30, 2022 at 07:32:50PM +0100, Catalin Marinas wrote:
> I started refreshing the series but I got stuck on having to do bouncing
> for small buffers even if when they go through the iommu (and I don't
> have the set up to test it yet).

For devices that go through the IOMMU, are you planning on adding similar
logic as you did in the direct-DMA path to bounce the buffer prior to
calling into whatever DMA ops are registered for the device?

Also, there are devices with ARM64 CPUs that disable SWIOTLB usage because
none of the peripherals that they engage in DMA with need bounce
buffering, and also to reclaim the default 64 MB of memory that SWIOTLB
uses. With this approach, SWIOTLB usage will become mandatory if those
devices need to perform non-coherent DMA transactions that may not
necessarily be DMA aligned (e.g. small buffers), correct?

If so, would there be concerns that the memory savings we get back from
reducing the memory footprint of kmalloc might be defeated by how much
memory is needed for bounce buffering?

I understand that we can use the "swiotlb=num_slabs" command line
parameter to minimize the amount of memory allocated for bounce
buffering. If this is the only way to minimize this impact, how much
memory would you recommend to allocate for bounce buffering on a system
that will only use bounce buffers for non-DMA-aligned buffers?

Thanks,
Isaac