Re: [PATCH] mm/page_alloc: Clarify some migratetype fallback code

Vlastimil Babka <vbabka@xxxxxxx> · Tue, 18 Feb 2025 11:19:58 +0100

On 2/17/25 17:26, Brendan Jackman wrote:
> On Fri, 14 Feb 2025 at 22:26, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>> >  2. can_steal_fallback() sounds as though it's answering the question "am
>> >     I functionally permitted to allocate from that other type" but in
>> >     fact it is encoding a heuristic preference.
>>
>> Here I don't see that nuance tbh.

I think I do.

>>
>> >  3. The same piece of data has different names in different places:
>> >     can_steal vs whole_block. This reinforces point 2 because it looks

I think some of the weirdness comes from the time before migratetype hygiene
series. IIRC it was possible to steal just the page we want to allocate,
that and extra pages but not the whole block, or whole block. Now it's
either just the page or whole block.

>> >     like the different names reflect a shift in intent from "am I
>> >     allowed to steal" to "do I want to steal", but no such shift exists.
>> >
>> > Fix 1. by avoiding the term "steal" in ambiguous contexts. This fixes
>> > 3. automatically since the natural name for can_steal is whole_block.
>>
>> I'm not a fan of whole_block because it loses the action verb. It
>> makes sense in the context of steal_suitable_fallback(), but becomes
>> ominous in find_suitable_fallback().
>>
>> Maybe @block_claimable would be better?

How about @claim_block ? That's even more "action verb" as the verb is not
passive.

> 
> Yeah that sounds good to me.
> 
>> > @@ -1948,7 +1948,7 @@ steal_suitable_fallback(struct zone *zone, struct page *page,
>> >       if (boost_watermark(zone) && (alloc_flags & ALLOC_KSWAPD))
>> >               set_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);
>> >
>> > -     /* We are not allowed to try stealing from the whole block */
>> > +     /* No point in stealing from the whole block */
>>
>> The original comment actually makes more sense to me. Why is there no
>> point? It could well find enough free+alike pages to steal the
>> block... It's just not allowed to.
> 
> OK so this is basically point 2 from the commit message, maybe I don't
> really understand why this condition is here, and maybe I'm about to
> look stupid.
> 
> Why don't we allow this code to steal the whole block? There wouldn't
> be any functional bug if it did, right? I thought that !whole_block
> just meant "flipping that block would not have any real benefit from a
> fragmentation point of view". So we'd just be doing work and modifying
> data structures for no good reason. Is there some stronger reason I'm
> missing why we really _mustn't_ steal it?

Agreed with your view.

>> I will say, the current code is pretty hard to reason about:
>>
>> On one hand we check the block size statically in can_steal_fallback;
>> on the other hand, we do that majority scan for compatible pages in
>> steal_suitable_fallback(). The effective outcomes are hard to discern,
>> and I'm honestly not convinced they're all intentional.
>>
>> For example, if we're allowed to steal the block because of this in
>> can_steal_fallback():
>>
>>         order >= pageblock_order/2
>>
>> surely, we'll always satisfy this in steal_suitable_fallback()
>>
>>         free_pages + alike_pages >= (1 << (pageblock_order-1)
>>
>> on free_pages alone.
> 
> No, the former is half the _order_ the latter is half the _number of
> pages_. So e.g. we could have order=6 (assuming pageblock_order=10)
> then free_pages might be only 1<<6 which is less than 1<<9.
> 
> Anyway your underlying point that this is confusing is obviously correct!
> 
>> And if the order is less than half a block, we're only allowed an
>> attempt at stealing it if this is true in can_steal_fallback():
>>
>>         start_type == MIGRATE_RECLAIMABLE ||
>>         start_type == MIGRATE_UNMOVABLE
>>
>> So why is the majority scan in steal_suitable_fallback() checking:
>>
>>         if (start_type == MIGRATE_MOVABLE)
>>                 alike_pages = movable_pages
>>
>> Here is how I read the effective rules:
>>
>> - We always steal the block if at least half of it is free.
>>
>> - If less than half is free, but more than half is compatible (free +
>>   alike), we currently do movable -> non-movable conversions.
>>
>>   We don't do non-movable -> movable (won't get to the majority scan).
>>   This seems reasonable to me, as there seems to be little value in
>>   making a new pre-polluted movable block.
>>
>> - However, we do non-movable -> movable conversion if more than half
>>   is free. This is seemingly in conflict with the previous point.
> 
> Hmm I'm not sure I'm seeing the "conflict", I think you just have to
> word it differently it's like:
> 
> - For movable allocations, there's a threshold of the square root of
> the pageblock size (??) before we consider stealing.
> 
> - Otherwise, we steal the block if more than half is compatible.
> 
> - If this threshold prevents us from stealing the whole block, we find
> the page via the smallest-order freelist possible instead of largest.
> 
> Does that seem right to you?
> 
> It feels like that last point has something to do with: if we know in
> advance that we aren't gonna steal the whole block, we wanna avoid
> breaking down a high-order page. But, if the allocation is movable, we
> wouldn't create persistent fragmentation by doing that. So I'm
> realising now that I don't entirely understand this.

Aha! Looks like this is also a leftover from before migratetype hygiene
series that went unnoticed. The logic was, if we're making an unmovable
allocation stealing from a movable block, even if we don't claim the whole
block, at least steal the highest order available. Then more unmovable
allocations can be satisfied from what remains of the high-order split,
before we have to steal from another movable pageblock.

If we're making a movable allocation stealing from an unmovable pageblock,
we don't need to make this buffer, as polluting unmovable pageblocks with
movable allocations is not that bad - they can be compacted later. So we go
for the smallest order we need.

However IIUC this is all moot now. If we don't claim the whole pageblock,
and split a high-order page, the remains of the split will go to the
freelists of the migratetype of the unclaimed pageblock (i.e. movable),
previously (before migratetype hygiene) they would got to the freelists of
the migratetype we want to allocate (unmovable).

So now it makes no sense to want the highest order if we're not claiming the
whole pageblock, we're just causing more fragmentation for no good reason.
We should always do the find_smallest. It would be good to fix this.

>> Then there is compaction, which currently uses only the
>> find_suitable_fallback() subset of the rules. Namely, for kernel
>> allocations, compaction stops as soon as there is an adequately sized
>> fallback. Even if the allocator won't convert the block due to the
>> majority scan. For movable requests, we'll stop if there is half a
>> block to fall back to.
> 
> Not half a block, "square root" of a block. But yeah.
> 
>> I suppose that's reasonable - the old
>> utilization vs. fragmentation debate aside...
>>
>> Did I miss one?
>>
>> We should be able to encode all this more concisely.
>>
>> > @@ -1995,12 +1995,14 @@ steal_suitable_fallback(struct zone *zone, struct page *page,
>> >
>> >  /*
>> >   * Check whether there is a suitable fallback freepage with requested order.
>> > - * If only_stealable is true, this function returns fallback_mt only if
>> > - * we can steal other freepages all together. This would help to reduce
>> > + * Sets *whole_block to instruct the caller whether it should convert a whole
>> > + * pageblock to the returned migratetype.
>> > + * If need_whole_block is true, this function returns fallback_mt only if
>> > + * we would do this whole-block stealing. This would help to reduce
>> >   * fragmentation due to mixed migratetype pages in one pageblock.
>> >   */
>> >  int find_suitable_fallback(struct free_area *area, unsigned int order,
>> > -                     int migratetype, bool only_stealable, bool *can_steal)
>> > +                     int migratetype, bool need_whole_block, bool *whole_block)
>> >  {
>> >       int i;
>> >       int fallback_mt;
>> > @@ -2008,19 +2010,19 @@ int find_suitable_fallback(struct free_area *area, unsigned int order,
>> >       if (area->nr_free == 0)
>> >               return -1;
>> >
>> > -     *can_steal = false;
>> > +     *whole_block = false;
>> >       for (i = 0; i < MIGRATE_PCPTYPES - 1 ; i++) {
>> >               fallback_mt = fallbacks[migratetype][i];
>> >               if (free_area_empty(area, fallback_mt))
>> >                       continue;
>> >
>> > -             if (can_steal_fallback(order, migratetype))
>> > -                     *can_steal = true;
>> > +             if (should_steal_whole_block(order, migratetype))
>> > +                     *whole_block = true;
>> >
>> > -             if (!only_stealable)
>> > +             if (!need_whole_block)
>> >                       return fallback_mt;
>> >
>> > -             if (*can_steal)
>> > +             if (*whole_block)
>> >                       return fallback_mt;
>> >       }
>>
>> This loop is quite awkward, but I think it actually gets more awkward
>> with the new names.
>>
>> Consider this instead:
>>
>>                 *block_claimable = can_claim_block(order, migratetype);
>>                 if (*block_claimable || !need_whole_block)
>>                         return fallback_mt;
> 
> Yeah and even just combining the conditionals makes this much easier
> to follow IMO.
> 
>> Or better yet, inline that function completely. There are no other
>> callers, and consolidating the rules into fewer places would IMO go a
>> long way of making it easier to follow:
>>
>>                 if (order >= pageblock_order/2 ||
>>                     start_mt == MIGRATE_RECLAIMABLE ||
>>                     start_mt == MIGRATE_UNMOVABLE)
>>                         *block_claimable = true;
>>
>>                 if (*block_claimable || !need_whole_block)
>>                         return fallback_mt;
> 
> Sounds good. There might also be some clarity to be gained by fiddling
> around with the comments and polarity of conditions too.

Agreed.

Would it make sense to have only "bool *whole_block" parameter of
find_suitable_fallback? The value the caller initializes it, it means the
current need_whole_block, the value it has upon return it instructs the
caller what to do. It would mean __compact_finished() would no longer pass
an unused parameter.

> E.g. It's confusing that the comment above that first conditional
> talks about returning false for movable pages, but the conditional is
> about returning true for unmovable.
> 
> Anyway I will have a bit more of a think about this.