From: Feng Tang <feng.tang@xxxxxxxxx> Subject: mm/mempolicy: cleanup nodemask intersection check for oom Patch series "mm/mempolicy: some fix and semantics cleanup", v4. Current memory policy code has some confusing and ambiguous part about MPOL_LOCAL policy, as it is handled as a faked MPOL_PREFERRED one, and there are many places having to distinguish them. Also the nodemask intersection check needs cleanup to be more explicit for OOM use, and handle MPOL_INTERLEAVE correctly. This patchset cleans up these and unifies the parameter sanity check for mbind() and set_mempolicy(). This patch (of 3): mempolicy_nodemask_intersects seem to be a general purpose mempolicy function. In fact it is partially tailored for the OOM purpose instead. The oom proper is the only existing user so rename the function to make that purpose explicit. While at it drop the MPOL_INTERLEAVE as those allocations never has a nodemask defined (see alloc_page_interleave) so this is a dead code and a confusing one because MPOL_INTERLEAVE is a hint rather than a hard requirement so it shouldn't be considered during the OOM. The final code can be reduced to a check for MPOL_BIND which is the only memory policy that is a hard requirement and thus relevant to a constrained OOM logic. [mhocko@xxxxxxxx: changelog edits] Link: https://lkml.kernel.org/r/1622560492-1294-1-git-send-email-feng.tang@xxxxxxxxx Link: https://lkml.kernel.org/r/1622560492-1294-2-git-send-email-feng.tang@xxxxxxxxx Link: https://lkml.kernel.org/r/1622469956-82897-1-git-send-email-feng.tang@xxxxxxxxx Link: https://lkml.kernel.org/r/1622469956-82897-2-git-send-email-feng.tang@xxxxxxxxx Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx> Suggested-by: Michal Hocko <mhocko@xxxxxxxx> Acked-by: Michal Hocko <mhocko@xxxxxxxx> Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> Cc: Ben Widawsky <ben.widawsky@xxxxxxxxx> Cc: Dan Williams <dan.j.williams@xxxxxxxxx> Cc: Dave Hansen <dave.hansen@xxxxxxxxx> Cc: David Rientjes <rientjes@xxxxxxxxxx> Cc: Huang Ying <ying.huang@xxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx> Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx> Cc: Vlastimil Babka <vbabka@xxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- include/linux/mempolicy.h | 2 +- mm/mempolicy.c | 34 +++++++++------------------------- mm/oom_kill.c | 2 +- 3 files changed, 11 insertions(+), 27 deletions(-) --- a/include/linux/mempolicy.h~mm-mempolicy-cleanup-nodemask-intersection-check-for-oom +++ a/include/linux/mempolicy.h @@ -150,7 +150,7 @@ extern int huge_node(struct vm_area_stru unsigned long addr, gfp_t gfp_flags, struct mempolicy **mpol, nodemask_t **nodemask); extern bool init_nodemask_of_mempolicy(nodemask_t *mask); -extern bool mempolicy_nodemask_intersects(struct task_struct *tsk, +extern bool mempolicy_in_oom_domain(struct task_struct *tsk, const nodemask_t *mask); extern nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy); --- a/mm/mempolicy.c~mm-mempolicy-cleanup-nodemask-intersection-check-for-oom +++ a/mm/mempolicy.c @@ -2094,16 +2094,16 @@ bool init_nodemask_of_mempolicy(nodemask #endif /* - * mempolicy_nodemask_intersects + * mempolicy_in_oom_domain * - * If tsk's mempolicy is "default" [NULL], return 'true' to indicate default - * policy. Otherwise, check for intersection between mask and the policy - * nodemask for 'bind' or 'interleave' policy. For 'preferred' or 'local' - * policy, always return true since it may allocate elsewhere on fallback. + * If tsk's mempolicy is "bind", check for intersection between mask and + * the policy nodemask. Otherwise, return true for all other policies + * including "interleave", as a tsk with "interleave" policy may have + * memory allocated from all nodes in system. * * Takes task_lock(tsk) to prevent freeing of its mempolicy. */ -bool mempolicy_nodemask_intersects(struct task_struct *tsk, +bool mempolicy_in_oom_domain(struct task_struct *tsk, const nodemask_t *mask) { struct mempolicy *mempolicy; @@ -2111,29 +2111,13 @@ bool mempolicy_nodemask_intersects(struc if (!mask) return ret; + task_lock(tsk); mempolicy = tsk->mempolicy; - if (!mempolicy) - goto out; - - switch (mempolicy->mode) { - case MPOL_PREFERRED: - /* - * MPOL_PREFERRED and MPOL_F_LOCAL are only preferred nodes to - * allocate from, they may fallback to other nodes when oom. - * Thus, it's possible for tsk to have allocated memory from - * nodes in mask. - */ - break; - case MPOL_BIND: - case MPOL_INTERLEAVE: + if (mempolicy && mempolicy->mode == MPOL_BIND) ret = nodes_intersects(mempolicy->v.nodes, *mask); - break; - default: - BUG(); - } -out: task_unlock(tsk); + return ret; } --- a/mm/oom_kill.c~mm-mempolicy-cleanup-nodemask-intersection-check-for-oom +++ a/mm/oom_kill.c @@ -104,7 +104,7 @@ static bool oom_cpuset_eligible(struct t * mempolicy intersects current, otherwise it may be * needlessly killed. */ - ret = mempolicy_nodemask_intersects(tsk, mask); + ret = mempolicy_in_oom_domain(tsk, mask); } else { /* * This is not a mempolicy constrained oom, so only _