Re: Compaction & folios

Vlastimil Babka <vbabka@xxxxxxx> · Thu, 7 Oct 2021 12:06:08 +0200

On 10/7/21 00:53, Kent Overstreet wrote:
> So I have some observations on memory compaction & hugepages.
> 
> Right now, the working assumption in MM is that compaction is hard and
> expensive, and right now it is - because most allocations are order 0, with a
> small subset being hugepage order allocations. This means any time we need a
> hugepage, compaction has to move a bunch of order 0 pages around, and memory
> reclaim is no help here - when we reclaim memory, it's coming back as fragmented
> order 0 pages.
> 
> But what if compaction wasn't such a difficult, expensive operation?
> 
> With folios, and then folios for anonymous pages, we won't see nearly so many
> order 0 allocations anymore - we'll see a spread of allocation sizes based on a
> mixture of application usage patterns - something much closer to a Poisson
> distribution, vs. our current very bimodal distribution. And since we won't be
> fragmenting all our allocations up front, memory reclaim will be freeing
> allocations in this same distribution.

Unfortunately, the main problem with compaction is not the act of moving a
number of LRU pages, but rather the presence of unmovable pages (slab, page
tables and other kernel allocations), where a single such page makes the
whole 2MB block unusable for a hugepage. So I don't expect this would help
dramatically for compaction, but the points added by Matthew would still apply.
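
To make the point concrete, here is a minimal sketch of a toy model. It is not
kernel code: the names page_kind, block_compactable and
count_compactable_blocks are hypothetical, and it assumes 4KB base pages, so a
2MB block is 512 pages. A block counts as usable only if it holds no unmovable
page, no matter how many movable pages would have to be migrated out of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_BLOCK 512 /* assumption: 2MB block / 4KB base pages */

/* Hypothetical per-page state for this toy model. */
enum page_kind { PAGE_FREE, PAGE_MOVABLE, PAGE_UNMOVABLE };

/* A block can be turned into a free 2MB page only if nothing in it is
 * unmovable; free and movable pages are not an obstacle. */
static bool block_compactable(const enum page_kind *block)
{
	for (size_t i = 0; i < PAGES_PER_BLOCK; i++)
		if (block[i] == PAGE_UNMOVABLE)
			return false;
	return true;
}

/* Count the blocks that compaction could, in principle, fully evacuate. */
static size_t count_compactable_blocks(const enum page_kind *pages,
				       size_t nr_blocks)
{
	size_t n = 0;

	assert(pages != NULL);
	for (size_t b = 0; b < nr_blocks; b++)
		if (block_compactable(pages + b * PAGES_PER_BLOCK))
			n++;
	return n;
}
```

In this model, marking one page out of 512 as PAGE_UNMOVABLE is enough to drop
that block from the count, regardless of the size of the movable allocations
around it.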

> Which means that any time an order n allocation fails, it's likely that we'll
> still have order n-1 pages free - and of those free order n-1 pages, one will
> likely have a buddy that's movable and hasn't been fragmented - meaning the
> common case is that compaction will have to move _one_ (higher order) page -
> we'll almost never be having to move a bunch of 4k pages.
> 
> Another way of thinking of this is that memory reclaim will be doing most of the
> work that compaction has to do now to allocate a high order page. Compaction
> will go from an expensive, somewhat unreliable operation to one that mostly just
> works - it's going to be _much_ less of a pain point.
> 
> It may turn out that allocating hugepages still doesn't work as reliably as we'd
> like - but folios are still a big help even when we can't allocate a 2MB page,
> because we'll be able to fall back to an order 6 or 7 or 8 allocation, which is
> something we can't do now. And, since multiple CPU vendors now support
> coalescing contiguous PTE entries in the TLB, this will still get us most of the
> performance benefits of using hugepages.
>
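
The arithmetic behind the order 6/7/8 fallback and the buddy relationship in
the quoted text can be sketched as follows, assuming 4KB base pages. The helper
names order_bytes and buddy_pfn are hypothetical; the XOR is the usual
buddy-allocator relation, shown here as an illustration rather than as a copy
of the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE_SHIFT 12 /* assumption: 4KB base pages */

/* Size in bytes of an order-n allocation, i.e. 2^n base pages. */
static size_t order_bytes(unsigned int order)
{
	assert(order < 20); /* keep the shift well within size_t */
	return (size_t)1 << (order + BASE_PAGE_SHIFT);
}

/* Buddy of the order-n block that starts at pfn: flip bit n of the pfn.
 * Merging a free block with its free buddy yields one order n+1 block. */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
	/* the block must be aligned to its own size */
	assert((pfn & ((1UL << order) - 1)) == 0);
	return pfn ^ (1UL << order);
}
```

With these assumptions, orders 6, 7 and 8 come out as 256KB, 512KB and 1MB,
and order 9 is the 2MB hugepage. An order 8 block at pfn 0 has its buddy at
pfn 256, so obtaining an order 9 page from a free order 8 block means
vacating that one neighbouring block.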