On Thu, Apr 14, 2022 at 03:25:59PM -0700, Linus Torvalds wrote:
> On Thu, Apr 14, 2022 at 12:49 PM Catalin Marinas
> <catalin.marinas@xxxxxxx> wrote:
> > It's a lot worse, ARCH_KMALLOC_MINALIGN is currently 128 bytes on arm64.
> > I want to at least get it down to 64 with this series while preserving
> > the current kmalloc() semantics.
>
> So here's a thought - maybe we could do the reverse of GFP_DMA, and
> add a flag to the places that want small allocations and know they
> don't need DMA?

I wonder whether that's a lot more churn than trying to identify places
where a small kmalloc()'ed buffer is passed to the DMA API. DMA into
kmalloc() buffers should be a small fraction of the total kmalloc()
uses.

For kmem_cache we have the SLAB_HWCACHE_ALIGN flag. We can add a similar
GFP_ flag as that's what we care about for DMA safety. It doesn't even
need to force the alignment to ARCH_DMA_MINALIGN but just
cache_line_size() (typically 64 on arm64 while ARCH_DMA_MINALIGN is 128
for about three platforms that have this requirement).

Functions like dma_map_single() can be made to track down the origin of
the buffer when size < cache_line_size() and warn if the slab is not
correctly aligned.

--
Catalin
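
As a rough illustration of the dma_map_single() check suggested above, here is a minimal userspace sketch. It is not kernel code: struct sketch_slab, sketch_dma_map_check() and SKETCH_CACHE_LINE_SIZE are hypothetical names, and the 64-byte line size is an assumption taken from the "typically 64 on arm64" figure. A real implementation would have to look up the originating slab from the buffer address, which is only modelled here by passing the origin in explicitly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 64-byte cache lines, the typical arm64 value. */
#define SKETCH_CACHE_LINE_SIZE 64u

/* Hypothetical stand-in for the slab cache a buffer was allocated from. */
struct sketch_slab {
	const char *name;
	size_t align;	/* alignment the cache guarantees for each object */
};

/*
 * Hypothetical model of the proposed check: only buffers smaller than a
 * cache line are inspected. Such a buffer is accepted if its slab aligns
 * objects to the cache line size, since it then cannot share a line with
 * a neighbouring object. Otherwise a warning is printed.
 */
static bool sketch_dma_map_check(const struct sketch_slab *origin, size_t size)
{
	if (size >= SKETCH_CACHE_LINE_SIZE)
		return true;	/* not covered by the small-buffer check */

	if (origin->align % SKETCH_CACHE_LINE_SIZE == 0)
		return true;	/* slab is cache-line aligned */

	fprintf(stderr,
		"warning: %zu-byte DMA buffer from slab %s (align %zu) may share a cache line\n",
		size, origin->name, origin->align);
	return false;
}
```

With this model, a 16-byte buffer from a slab aligned to 8 bytes triggers the warning, while the same buffer from a 64-byte-aligned slab passes, as does any buffer of 64 bytes or more.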