On 3 Dec 2024, at 9:24, Vlastimil Babka wrote: > On 12/3/24 15:12, David Hildenbrand wrote: >> On 03.12.24 14:55, Vlastimil Babka wrote: >>> On 12/3/24 10:47, David Hildenbrand wrote: >>>> It's all a bit complicated for alloc_contig_range(). For example, we don't >>>> support many flags, so let's start bailing out on unsupported >>>> ones -- ignoring the placement hints, as we are already given the range >>>> to allocate. >>>> >>>> While we currently set cc.gfp_mask, in __alloc_contig_migrate_range() we >>>> simply create yet another GFP mask whereby we ignore the reclaim flags >>>> specify by the caller. That looks very inconsistent. >>>> >>>> Let's clean it up, constructing the gfp flags used for >>>> compaction/migration exactly once. Update the documentation of the >>>> gfp_mask parameter for alloc_contig_range() and alloc_contig_pages(). >>>> >>>> Acked-by: Zi Yan <ziy@xxxxxxxxxx> >>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx> >>> >>> Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx> >>> >>>> + /* >>>> + * Flags to control page compaction/migration/reclaim, to free up our >>>> + * page range. Migratable pages are movable, __GFP_MOVABLE is implied >>>> + * for them. >>>> + * >>>> + * Traditionally we always had __GFP_HARDWALL|__GFP_RETRY_MAYFAIL set, >>>> + * keep doing that to not degrade callers. >>>> + */ >>> >>> Wonder if we could revisit that eventually. Why limit migration targets by >>> cpuset via __GFP_HARDWALL if we were not called with __GFP_HARDWALL? And why >>> weaken the attempts with __GFP_RETRY_MAYFAIL if we didn't specify it? >> >> See below. >> >>> >>> Unless I'm missing something, cc->gfp is only checked for __GFP_FS and >>> __GFP_NOWARN in few places, so it's mostly migration_target_control the >>> callers could meaningfully influence. >> >> Note the fist change in the file, where we now use the mask instead of coming up >> with another one out of the blue. :) > > I know. What I wanted to say - cc->gfp is on its own only checked in few > places, but now since we also translate it to migration_target_control's > gfp_mask, it's mostly that part the caller might influence with the passed > flags. But we still impose own additions to it, limiting that influence. > >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index ce7589a4ec01..54594cc4f650 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -6294,7 +6294,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc, >> int ret = 0; >> struct migration_target_control mtc = { >> .nid = zone_to_nid(cc->zone), >> - .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL, >> + .gfp_mask = cc->gfp_mask, >> .reason = MR_CONTIG_RANGE, >> }; >> >> GFP_USER contains __GFP_HARDWALL. I am not sure if that matters here, but > > Yeah wonder if GFP_USER was used specifically for that part, or just randomly :) > >> likely the thing we are assuming here is that we are migrating a page, and >> usually, these are user allocation (except maybe balloon and some other non-lru >> movable things). > > Yeah and user allocations obey cpuset and mempolicies etc. But these are > likely somebody elses allocations that were done according to their > policies. With our migration we might be actually violating those, which > probably can't be helped (is at least migration within the same node > preferred? hmm). But it doesn't seem to me that our caller's restrictions > (if those exist, would be enforced by __GFP_HARDWALL) are that relevant for > somebody else's pages? Yeah, I was wondering why current_gfp_context() is used to adjust gfp_mask, since current context might not be relevant. But I see it is used in the original code, so I did not ask. If current context is irrelevant w.r.t the to-be-migrated pages, should current_gfp_context() part be removed? Ideally, to respect the to-be-migrated page’s gfp, we might need to go through rmap to find its corresponding vma and possible task struct, but that seems overly complicated. Best Regards, Yan, Zi