On Wed, Jan 24, 2024 at 11:18:24AM -0800, Christoph Lameter (Ampere) wrote:
> On Mon, 22 Jan 2024, Matthew Wilcox wrote:
>
> > When we have memdescs, allocating a folio from the buddy is a two step
> > process. First we allocate the struct folio from slab, then we ask the
> > buddy allocator for 2^n pages, each of which gets its memdesc set to
> > point to this folio. It'll be similar for other memory descriptors,
> > but let's keep it simple and just talk about folios for now.
>
> I need to catch up on memdescs. One of the key issues may be fragmentation
> that occurs during alloc / free of folios of different sizes.

A lot of what we have now is opportunistic. We'll use larger allocations
if they're readily available, and if not we'll fall back (and also kick
kswapd to try to free up some memory). This is fine for the current
purposes, but may be less fine for the people who want to support large
LBA devices. I don't think it'll be a problem as they should be able to
allocate more memory that is large enough, just by evicting memory from
the page cache that comes from the same device (so is by definition large
enough).

> Maybe we could use an approach similar to what the slab allocator uses to
> defrag. Allocate larger folios/pages and then break out sub
> folios/sizes/components until the page is full and recycle any frees of
> components in that page before going to the next large page.

It's certainly something we could do, but then we're back to setting up
the compound page again, and the idea was to avoid doing that. So really
this is a competing idea, not a complementary idea.
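
To make the "opportunistic" behaviour above a bit more concrete, here is a
rough userspace sketch. It is not kernel code: every name in it
(sketch_buddy, sketch_take, sketch_alloc_opportunistic, SKETCH_NR_ORDERS)
is made up for illustration, the free lists are just counters, and unlike
the real buddy allocator it never splits a larger block. The min_order
parameter is my way of modelling a large LBA device that cannot accept
anything smaller than its block size; that mapping is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_ORDERS 10	/* hypothetical: orders 0..9 */

/* Toy stand-in for the buddy free lists: free block count per order. */
struct sketch_buddy {
	unsigned int nr_free[SKETCH_NR_ORDERS];
	bool reclaim_kicked;	/* stands in for waking kswapd */
};

/* Take one block of exactly this order. No splitting in this toy. */
static bool sketch_take(struct sketch_buddy *b, unsigned int order)
{
	if (b->nr_free[order] == 0)
		return false;
	b->nr_free[order]--;
	return true;
}

/*
 * Prefer 'preferred', but accept any order down to 'min_order'.
 * Returns the order actually obtained, or -1 if nothing in range is free.
 */
static int sketch_alloc_opportunistic(struct sketch_buddy *b,
				      unsigned int preferred,
				      unsigned int min_order)
{
	assert(preferred < SKETCH_NR_ORDERS && min_order <= preferred);

	for (unsigned int order = preferred; ; order--) {
		if (sketch_take(b, order))
			return (int)order;
		/* A miss: ask background reclaim to free up some memory. */
		b->reclaim_kicked = true;
		if (order == min_order)
			break;	/* cannot fall back any further */
	}
	return -1;
}
```

With min_order == 0 this is the page cache case today: a miss at a high
order costs nothing but a smaller folio. With min_order > 0 the fallback
range shrinks, and the caller has to rely on reclaim (for example evicting
page cache from the same device) rather than on a smaller allocation.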