On Mon, Aug 02, 2021 at 01:14:29PM +0200, Michal Hocko wrote:
> On Mon 02-08-21 16:11:30, Feng Tang wrote:
> > On Fri, Jul 30, 2021 at 03:18:40PM +0800, Tang, Feng wrote:
> > [snip]
> > > > > One thing is, it's possible that 'nd' is not set in the preferred
> > > > > nodemask.
> > > >
> > > > Yes, and there shouldn't be any problem with that. The given node is
> > > > only used to get the respective zonelist (order distance ordered list of
> > > > zones to try). get_page_from_freelist will then use the preferred node
> > > > mask to filter this zone list. Is that more clear now?
> > >
> > > Yes, from the code, the policy_node() is always coupled with
> > > policy_nodemask(), which secures the 'nodemask' limit. Thanks for
> > > the clarification!
> >
> > Hi Michal,
> >
> > To ensure the nodemask limit, the policy_nodemask() also needs some
> > change to return the nodemask for 'prefer-many' policy, so here is an
> > updated 1/6 patch, which mainly changes the node/nodemask selection
> > for 'prefer-many' policy, could you review it? thanks!
>
> right, I have mixed it with get_policy_nodemask
>
> > @@ -1875,8 +1897,13 @@ static int apply_policy_zone(struct mempolicy *policy, enum zone_type zone)
> >   */
> >  nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy)
> >  {
> > -	/* Lower zones don't get a nodemask applied for MPOL_BIND */
> > -	if (unlikely(policy->mode == MPOL_BIND) &&
> > +	int mode = policy->mode;
> > +
> > +	/*
> > +	 * Lower zones don't get a nodemask applied for 'bind' and
> > +	 * 'prefer-many' policies
> > +	 */
> > +	if (unlikely(mode == MPOL_BIND || mode == MPOL_PREFERRED_MANY) &&
> >  			apply_policy_zone(policy, gfp_zone(gfp)) &&
> >  			cpuset_nodemask_valid_mems_allowed(&policy->nodes))
> >  		return &policy->nodes;
>
> Isn't this just too cryptic? Why didn't you simply
> 	if (mode == MPOL_PREFERRED_MANY)
> 		return &policy->nodes;
>
> in addition to the existing code? I mean why would you even care about
> cpusets? Those are handled at the page allocator layer and will further
> filter the given nodemask.

Ok, I will follow your suggestion and keep the 'bind' handling unchanged.

And to be honest, I don't fully understand the current handling of the
'bind' policy: could returning NULL for a 'bind' policy open a loophole
around the strict 'bind' limit?

Thanks,
Feng

> --
> Michal Hocko
> SUSE Labs
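
For reference, here is a minimal userspace sketch of the shape suggested
above: the existing 'bind' check stays as it is, and 'prefer-many' simply
returns the policy's nodemask. This is not the kernel code. The types,
the enum values, and the name policy_nodemask_sketch() are hypothetical
stand-ins, and the results of apply_policy_zone() and
cpuset_nodemask_valid_mems_allowed() are passed in as plain flags rather
than computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef unsigned long nodemask_t;	/* one bit per NUMA node */

enum mpol_mode {
	MPOL_DEFAULT,
	MPOL_PREFERRED,
	MPOL_BIND,
	MPOL_PREFERRED_MANY,
};

struct mempolicy {
	enum mpol_mode mode;
	nodemask_t nodes;
};

/*
 * zone_applies models the result of apply_policy_zone(), and cpuset_ok
 * models cpuset_nodemask_valid_mems_allowed(). Both are assumptions made
 * so the sketch can be compiled outside the kernel.
 */
static nodemask_t *policy_nodemask_sketch(struct mempolicy *policy,
					  bool zone_applies, bool cpuset_ok)
{
	/* Existing 'bind' handling, kept unchanged. */
	if (policy->mode == MPOL_BIND && zone_applies && cpuset_ok)
		return &policy->nodes;

	/* 'prefer-many': return the mask and let the allocator filter it. */
	if (policy->mode == MPOL_PREFERRED_MANY)
		return &policy->nodes;

	/* Every other case: no nodemask restriction from the policy. */
	return NULL;
}

/* Small self-check showing the three cases discussed in this thread. */
static void policy_nodemask_sketch_selfcheck(void)
{
	struct mempolicy bind = { MPOL_BIND, 0x3UL };
	struct mempolicy many = { MPOL_PREFERRED_MANY, 0x5UL };

	assert(policy_nodemask_sketch(&bind, true, true) == &bind.nodes);
	assert(policy_nodemask_sketch(&bind, false, true) == NULL);
	assert(policy_nodemask_sketch(&many, false, false) == &many.nodes);
}
```

In this sketch the 'bind' case still returns NULL when the zone check
fails, which is the behaviour asked about above. The sketch only
reproduces that behaviour and does not settle whether it weakens the
strict 'bind' limit.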