Re: Project: Improving the PCP allocator

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Wed, 24 Jan 2024 21:03:44 +0000

On Wed, Jan 24, 2024 at 11:18:24AM -0800, Christoph Lameter (Ampere) wrote:
> On Mon, 22 Jan 2024, Matthew Wilcox wrote:
> 
> > When we have memdescs, allocating a folio from the buddy is a two step
> > process.  First we allocate the struct folio from slab, then we ask the
> > buddy allocator for 2^n pages, each of which gets its memdesc set to
> > point to this folio.  It'll be similar for other memory descriptors,
> > but let's keep it simple and just talk about folios for now.
> 
> I need to catch up on memdescs. One of the key issues may be fragmentation
> that occurs during alloc / free of folios of different sizes.

A lot of what we have now is opportunistic.  We'll use larger allocations
if they're readily available, and if not we'll fall back (and also kick
kswapd to try to free up some memory).  This is fine for the current
purposes, but may be less fine for the people who want to support large
LBA devices.  I don't think it'll be a problem, as they should be able
to allocate sufficiently large blocks just by evicting page cache memory
that belongs to the same device (and is therefore, by definition, large
enough).
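
To make "opportunistic" concrete, here is a rough userspace sketch of
that fallback pattern.  It is not kernel code: struct toy_buddy,
toy_alloc_opportunistic() and the kswapd_wakeups counter are made-up
names for illustration, and the toy does not split larger blocks the
way the real buddy allocator does.  It only models "take the preferred
order if one is readily available, otherwise nudge reclaim and settle
for something smaller".

```c
#include <assert.h>

#define TOY_MAX_ORDER 10	/* assumption for this sketch only */

/* Hypothetical toy state: free_count[o] is the number of free order-o blocks. */
struct toy_buddy {
	unsigned int free_count[TOY_MAX_ORDER + 1];
	unsigned int kswapd_wakeups;	/* stands in for waking kswapd */
};

/*
 * Try the preferred order first.  If it is not readily available, record
 * one reclaim wakeup and fall back to progressively smaller orders.
 * Returns the order actually obtained, or -1 if nothing was available.
 */
static int toy_alloc_opportunistic(struct toy_buddy *b, int preferred,
				   int min_order)
{
	assert(preferred <= TOY_MAX_ORDER);
	assert(min_order >= 0 && min_order <= preferred);

	for (int order = preferred; order >= min_order; order--) {
		if (b->free_count[order] > 0) {
			b->free_count[order]--;
			return order;
		}
		/* First miss: kick background reclaim, then keep falling back. */
		if (order == preferred)
			b->kswapd_wakeups++;
	}
	return -1;
}
```

A caller that needs a minimum block size would pass that size as
min_order, so a failure shows up as -1 instead of a silently smaller
allocation.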

> Maybe we could use an approach similar to what the slab allocator uses to
> defrag. Allocate larger folios/pages and then break out sub
> folios/sizes/components until the page is full and recycle any frees of
> components in that page before going to the next large page.

It's certainly something we could do, but then we're back to setting
up the compound page again, and the idea was to avoid doing that.
So really this is a competing idea, not a complementary idea.
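
For contrast, the two-step memdesc allocation described at the top of
the thread might look roughly like this as a userspace toy.  All names
here (toy_folio, toy_page, toy_mem_map, toy_folio_alloc) are
hypothetical, malloc() stands in for slab, and the caller picks the
pfn where the real thing would ask the buddy allocator for 2^n pages.
The point is only that each page ends up with its memdesc pointing at
the one descriptor, with no compound page setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NR_PAGES	64UL	/* size of the toy memory map */
#define TOY_MAX_ORDER	6	/* 2^6 == TOY_NR_PAGES */

struct toy_folio {
	unsigned int order;
	unsigned long first_pfn;
};

struct toy_page {
	struct toy_folio *memdesc;	/* NULL means the page is free */
};

static struct toy_page toy_mem_map[TOY_NR_PAGES];

static struct toy_folio *toy_folio_alloc(unsigned long pfn, unsigned int order)
{
	unsigned long nr, i;
	struct toy_folio *folio;

	if (order > TOY_MAX_ORDER)
		return NULL;
	nr = 1UL << order;
	/* The range must fit in the map and be naturally aligned. */
	if (pfn >= TOY_NR_PAGES || nr > TOY_NR_PAGES - pfn || (pfn & (nr - 1)))
		return NULL;
	for (i = 0; i < nr; i++)
		if (toy_mem_map[pfn + i].memdesc)
			return NULL;

	/* Step 1: allocate the descriptor (slab in the real design). */
	folio = malloc(sizeof(*folio));
	if (!folio)
		return NULL;
	folio->order = order;
	folio->first_pfn = pfn;

	/* Step 2: point every page's memdesc at the descriptor. */
	for (i = 0; i < nr; i++)
		toy_mem_map[pfn + i].memdesc = folio;
	return folio;
}

static void toy_folio_free(struct toy_folio *folio)
{
	unsigned long nr = 1UL << folio->order, i;

	for (i = 0; i < nr; i++) {
		assert(toy_mem_map[folio->first_pfn + i].memdesc == folio);
		toy_mem_map[folio->first_pfn + i].memdesc = NULL;
	}
	free(folio);
}
```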