On Mon 02-08-21 16:11:30, Feng Tang wrote:
> On Fri, Jul 30, 2021 at 03:18:40PM +0800, Tang, Feng wrote:
> [snip]
> > > > One thing is, it's possible that 'nd' is not set in the preferred
> > > > nodemask.
> > >
> > > Yes, and there shouldn't be any problem with that. The given node is
> > > only used to get the respective zonelist (order distance ordered list of
> > > zones to try). get_page_from_freelist will then use the preferred node
> > > mask to filter this zone list. Is that more clear now?
> >
> > Yes, from the code, the policy_node() is always coupled with
> > policy_nodemask(), which secures the 'nodemask' limit. Thanks for
> > the clarification!
>
> Hi Michal,
>
> To ensure the nodemask limit, the policy_nodemask() also needs some
> change to return the nodemask for 'prefer-many' policy, so here is a
> updated 1/6 patch, which mainly changes the node/nodemask selection
> for 'prefer-many' policy, could you review it? thanks!

Right, I had mixed it up with get_policy_nodemask.

> @@ -1875,8 +1897,13 @@ static int apply_policy_zone(struct mempolicy *policy, enum zone_type zone)
>   */
>  nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy)
>  {
> -	/* Lower zones don't get a nodemask applied for MPOL_BIND */
> -	if (unlikely(policy->mode == MPOL_BIND) &&
> +	int mode = policy->mode;
> +
> +	/*
> +	 * Lower zones don't get a nodemask applied for 'bind' and
> +	 * 'prefer-many' policies
> +	 */
> +	if (unlikely(mode == MPOL_BIND || mode == MPOL_PREFERRED_MANY) &&
>  			apply_policy_zone(policy, gfp_zone(gfp)) &&
>  			cpuset_nodemask_valid_mems_allowed(&policy->nodes))
>  		return &policy->nodes;

Isn't this just too cryptic? Why didn't you simply add

	if (mode == MPOL_PREFERRED_MANY)
		return &policy->nodes;

in addition to the existing code? I mean, why would you even care about
cpusets here? Those are handled at the page allocator layer, which will
further filter the given nodemask.
-- 
Michal Hocko
SUSE Labs
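
For reference, here is a minimal userspace sketch of the selection logic suggested above. It is not kernel code: nodemask_t is reduced to a plain bitmask, policy_nodemask_model() is a hypothetical name, and the existing MPOL_BIND conditions (apply_policy_zone() plus the cpuset validity check) are collapsed into a single assumed boolean, bind_applies. It only illustrates that 'prefer-many' can return its nodemask unconditionally while the 'bind' path keeps its existing checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel type: one bit per NUMA node (assumption). */
typedef unsigned long nodemask_t;

/* Simplified subset of the policy modes discussed in this thread. */
enum mpol_mode {
	MPOL_DEFAULT,
	MPOL_PREFERRED,
	MPOL_BIND,
	MPOL_INTERLEAVE,
	MPOL_PREFERRED_MANY,
};

/* Reduced model of struct mempolicy: only the fields used here. */
struct mempolicy {
	enum mpol_mode mode;
	nodemask_t nodes;
};

/*
 * Hypothetical model of the suggested policy_nodemask() shape.
 * 'bind_applies' stands in for the existing MPOL_BIND conditions
 * (apply_policy_zone() and the cpuset check), which are not modelled.
 * Returns the policy nodemask, or NULL when no nodemask restriction
 * should be passed down to the allocator.
 */
static const nodemask_t *policy_nodemask_model(const struct mempolicy *policy,
					       bool bind_applies)
{
	assert(policy != NULL);

	/* 'prefer-many': always hand the nodemask to the allocator. */
	if (policy->mode == MPOL_PREFERRED_MANY)
		return &policy->nodes;

	/* 'bind': unchanged, still gated by the existing conditions. */
	if (policy->mode == MPOL_BIND && bind_applies)
		return &policy->nodes;

	return NULL;
}
```

In this model cpuset filtering is deliberately absent from the 'prefer-many' branch, matching the point that the page allocator applies it later.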