On Fri, Oct 14, 2022 at 9:25 AM Catalin Marinas
<catalin.marinas@xxxxxxx> wrote:
>
> On Thu, Oct 13, 2022 at 11:58:22AM -0700, Saravana Kannan wrote:
> > On Thu, Oct 13, 2022 at 9:57 AM Catalin Marinas
> > <catalin.marinas@xxxxxxx> wrote:
> > > > If so, would there be concerns that the memory savings we get
> > > > back from reducing the memory footprint of kmalloc might be
> > > > defeated by how much memory is needed for bounce buffering?
> > >
> > > It's not necessarily about the saved memory but also locality of
> > > the small buffer allocations, less cache and TLB pressure.
> >
> > Part of the pushback we get when we try to move some of the Android
> > ecosystem from 32-bit to 64-bit is the memory usage increase. So,
> > while the main goal might not be memory savings, it'd be good to
> > keep that in mind too. I definitely wouldn't want this patch series
> > to make things worse. Ideally, it'd make things better. 10MB is
> > considered a lot on some of the super low-spec'd devices.
>
> Well, we can still add the option to skip allocating from the small
> kmalloc caches if there's no swiotlb available.

This seems like a good compromise.

> > > I wonder whether swiotlb is actually the best option for bouncing
> > > unaligned buffers. We could use something like mempool_alloc()
> > > instead if we stick to small buffers rather than any (even large)
> > > buffer that's not aligned to a cache line. Or just go for
> > > kmem_cache_alloc() directly. A downside is that we may need
> > > GFP_ATOMIC for such allocations, so higher risk of failure.
> >
> > Yeah, a temporary kmem_cache_alloc() to bounce buffers off of feels
> > like a better idea than swiotlb. Especially for small allocations
> > (say 8-byte allocations) that might have gone into the kmalloc-64
> > cache if we hadn't dropped KMALLOC_MIN_ALIGN to 8.
>
> I now remembered why this isn't trivial. On the dma_unmap_*() side,
> we can't easily tell whether the buffer was bounced and needs
> freeing. swiotlb solves this by checking whether the address is
> within the pre-allocated swiotlb range. With kmem_caches, we could
> dig into the slab internals and check whether the pointer is part of
> a slab page, then check which kmem_cache it came from. It looks too
> complicated and I'm rather tempted to just go for swiotlb if
> available, or prevent the creation of small kmalloc caches if
> there's no swiotlb (TBH, even the initial series was an improvement,
> dropping KMALLOC_MIN_ALIGN from 128 to 64).

Agreed. Even allowing a 64-byte kmalloc cache on a system with a
64-byte cacheline size saves quite a bit of memory.

-Saravana
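
For reference, the unmap-side check Catalin describes is cheap because
swiotlb keeps all bounce slots in one pre-allocated physical range. A
minimal sketch of that decision, using the existing is_swiotlb_buffer()
and swiotlb_tbl_unmap_single() helpers (the wrapper function itself is
hypothetical):

#include <linux/dma-direct.h>
#include <linux/swiotlb.h>

/* Sketch: how dma_unmap_*() can tell whether a buffer was bounced.
 * All swiotlb slots live in one pre-allocated physical range, so a
 * simple address check is enough; no per-mapping state is needed.
 */
static void example_unmap(struct device *dev, dma_addr_t dma_addr,
			  size_t size, enum dma_data_direction dir)
{
	phys_addr_t paddr = dma_to_phys(dev, dma_addr);

	/* True iff paddr falls inside the swiotlb bounce pool. */
	if (is_swiotlb_buffer(dev, paddr))
		/* Copy back to the original buffer and free the slot. */
		swiotlb_tbl_unmap_single(dev, paddr, size, dir, 0);
}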
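
By contrast, a kmem_cache-based bounce scheme would have to answer the
same question from slab internals. A rough sketch of what that would
involve (bounce_cache is hypothetical; folio_slab() and struct slab
live in the MM-private mm/slab.h, which is exactly why this is
unattractive for DMA code):

#include <linux/mm.h>
#include <linux/slab.h>
#include "slab.h"	/* MM-private: struct slab, folio_slab() */

/* Hypothetical dedicated cache used only for bounce buffers. */
static struct kmem_cache *bounce_cache;

static bool example_free_if_bounced(void *vaddr)
{
	struct folio *folio = virt_to_folio(vaddr);
	struct slab *slab;

	if (!folio_test_slab(folio))
		return false;	/* not a slab object, can't be a bounce */

	slab = folio_slab(folio);
	if (slab->slab_cache != bounce_cache)
		return false;	/* a slab object, but the caller's own */

	kmem_cache_free(bounce_cache, vaddr);
	return true;
}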
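
And the compromise both sides of the thread converge on -- don't
create sub-cacheline kmalloc caches unless a bounce pool exists --
could be wired up roughly like this (the function name is an
assumption; io_tlb_default_mem, ARCH_KMALLOC_MINALIGN and
dma_get_cache_alignment() are existing kernel symbols):

#include <linux/dma-mapping.h>
#include <linux/slab.h>
#include <linux/swiotlb.h>

/* Sketch: pick the minimum kmalloc alignment at boot. With a swiotlb
 * pool present, unaligned small buffers can be bounced, so kmalloc-8/
 * 16/32 are safe to create; without one, stay at the cacheline size
 * so non-coherent DMA never shares a line with another object.
 */
static unsigned int example_kmalloc_minalign(void)
{
	if (io_tlb_default_mem.nslabs)
		return ARCH_KMALLOC_MINALIGN;

	return dma_get_cache_alignment();
}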