On Fri, Dec 22, 2023 at 01:29:18PM +0100, Hannes Reinecke wrote:
> And that is actually a very valid point; memory fragmentation will become an
> issue with larger block sizes.
>
> Theoretically it should be quite easily solved; just switch the memory
> subsystem to use the largest block size in the system, and run every smaller
> memory allocation via SLUB (or whatever the allocator-of-the-day
> currently is :-).  Then trivially the system will never be fragmented,
> and I/O can always use large folios.
>
> However, that means doing away with alloc_page(), which is still in
> widespread use throughout the kernel.  I would actually be in favour of it,
> but it might be that mm people have a different view.
>
> Matthew, worth a new topic?
> Handling memory fragmentation on large block I/O systems?

I think if we're going to do that as a topic (and I'm not opposed!), we
need data.  Various workloads, various block sizes, etc.  Right now
people discuss this topic with "feelings" and "intuition", and I think
we need more than vibes to have a productive discussion.

My laptop (rebooted last night due to an unfortunate upgrade that left
anything accessing the sound device hanging ...):

MemTotal:       16006344 kB
MemFree:         2353108 kB
Cached:          7957552 kB
AnonPages:       4271088 kB
Slab:             654896 kB

so ~50% of my 16GB of memory is in the page cache and ~25% is anon
memory.  If the page cache is all in 16kB chunks and we need to
allocate an order-2 folio in order to read from a file, we can find one
easily by reclaiming other order-2 folios from the page cache.  We
don't need to resort to heroics like eliminating the use of
alloc_page().

We should eliminate the use of alloc_page() across most of the kernel,
but that's a different topic, and one that has little relevance to
LSF/MM since it's drivers that need to change, not the MM ;-)

Now, other people "feel" differently.  And that's cool, but we're not
going to have a productive discussion without data that shows whose
feelings represent reality, and for which kinds of workloads.
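
For anyone who wants to reproduce the percentages above on their own
machine, here is a minimal sketch of a userspace helper (nothing that
exists in-tree, purely an illustration) which parses /proc/meminfo and
prints the Cached, AnonPages and Slab shares of MemTotal:

/*
 * Illustrative only: print the share of MemTotal held by the page
 * cache (Cached), anonymous memory (AnonPages) and slab (Slab).
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256], key[64];
	unsigned long long kb, total = 0, cached = 0, anon = 0, slab = 0;

	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		/* lines look like "MemTotal:       16006344 kB" */
		if (sscanf(line, "%63[^:]: %llu", key, &kb) != 2)
			continue;
		if (!strcmp(key, "MemTotal"))
			total = kb;
		else if (!strcmp(key, "Cached"))
			cached = kb;
		else if (!strcmp(key, "AnonPages"))
			anon = kb;
		else if (!strcmp(key, "Slab"))
			slab = kb;
	}
	fclose(f);
	if (!total)
		return 1;
	printf("page cache %.1f%%  anon %.1f%%  slab %.1f%%\n",
	       100.0 * cached / total,
	       100.0 * anon / total,
	       100.0 * slab / total);
	return 0;
}

On the snapshot above it works out to roughly 49.7% page cache, 26.7%
anon and 4.1% slab, which is where the ~50% / ~25% estimate comes from.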