On Fri, Oct 14, 2022 at 1:24 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> Agreed. Even allowing a 64-byte kmalloc cache on a system with a
> 64-byte cacheline size saves quite a bit of memory.

Well, the *really* trivial thing to do is to just say "if the platform
is DMA coherent, just allow any size kmalloc cache". And just
consciously leave the broken garbage behind.

Because it's not just 64-byte kmalloc. It's things like 'avtab_node'
that is 24 bytes, and that on my system currently uses about 3MB of
memory simply because there's a _lot_ of them.

I've got 1.8MB in "kmalloc-32" too, and about 1MB in "kmalloc-16",
fwiw. That's

Yeah, yeah, this is on a 64GB machine and so none of that matters (and
some of these things are very much "scales with memory"), but these
small allocations aren't actually all that unusual.

And yes, the above is obviously on my x86-64 machine. My arm64 laptop
doesn't have the small kmallocs, and as a result the "kmalloc-128" has
633 _thousand_ entries, and takes up 77MB of RAM on my 16GB laptop.
I'm assuming (but have no proof) more than 50% of that is just wasted
memory.

I suspect that DMA is cache coherent on this thing, and that wasted
space is just *stupid* wasted space, but hey, I don't actually know. I
just use the Asahi people's patches.

              Linus
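
The numbers above can be sanity-checked with a little arithmetic: roughly 633 thousand objects at 128 bytes each is about 81 million bytes, or about 77MB, which matches the "kmalloc-128" figure. What follows is a minimal sketch of the rounding cost, not kernel code. The function names are hypothetical, the 8-byte floor for the coherent case is an assumption, and the model only uses power-of-two buckets (it ignores the kmalloc-96 and kmalloc-192 caches).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest power-of-two bucket that holds `size`,
 * never smaller than `min_bucket`. Simplified model, not the kernel's
 * actual kmalloc size-class lookup.
 */
size_t bucket_size(size_t size, size_t min_bucket)
{
    size_t bucket = min_bucket;

    /* min_bucket must be a non-zero power of two */
    assert(min_bucket != 0 && (min_bucket & (min_bucket - 1)) == 0);

    while (bucket < size)
        bucket <<= 1;
    return bucket;
}

/*
 * Hypothetical policy following the suggestion in the mail: if DMA is
 * coherent, allow small buckets (8 bytes assumed here as the floor);
 * otherwise keep the floor at the DMA alignment.
 */
size_t min_bucket_for(bool dma_coherent, size_t dma_minalign)
{
    return dma_coherent ? 8 : dma_minalign;
}

/* Bytes lost to rounding for `count` objects of `obj_size` bytes each. */
size_t wasted_bytes(size_t obj_size, size_t count, size_t min_bucket)
{
    return (bucket_size(obj_size, min_bucket) - obj_size) * count;
}
```

With a 128-byte floor, a 24-byte object such as 'avtab_node' would occupy a 128-byte slot and lose 104 bytes per object. With small buckets allowed it would land in a 32-byte slot and lose 8.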