On Wed, Sep 01, 2021 at 04:05:40PM +0200, Michal Hocko wrote:
[SNIP]
> This looks better than the previous attempt. It would be still better to
> solve this at the page allocator layer. The slowpath is already doing
> this for the nodemask. E.g.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eeb3a9cb36bb..a3193134540d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4929,6 +4929,17 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> 	if (!ac->preferred_zoneref->zone)
> 		goto nopage;
>
> +	/*
> +	 * Check for insane configurations where the cpuset doesn't contain
> +	 * any suitable zone to satisfy the request - e.g. kernel allocations
> +	 * from MOVABLE nodes only
> +	 */
> +	if (cpusets_enabled() && (gfp_mask & __GFP_HARDWALL)) {
> +		struct zoneref *z = first_zones_zonelist(ac->zonelist,
> +					ac->highest_zoneidx,
> +					&cpuset_current_mems_allowed);
> +		if (!z->zone)
> +			goto nopage;
> +	}
> +
> 	if (alloc_flags & ALLOC_KSWAPD)
> 		wake_all_kswapds(order, gfp_mask, ac);

Thanks for the suggestion! It does bail out early, skipping the kswapd
wakeup, direct reclaim and compaction.

I also looked at prepare_alloc_pages(), which does some cpuset checks and
zone initialization, but I'd rather leave it alone, as it sits on a real
hot path, while this check is in the slowpath anyway.

Will run some page fault benchmark cases with this patch.

Thanks,
Feng

> if this is seen as an additional overhead for an insane configuration
> then we can add insane_cpusets_enabled() which would be a static branch
> enabled when somebody actually tries to configure movable only cpusets
> or potentially other dubious usage.
> --
> Michal Hocko
> SUSE Labs