From: Oleg Nesterov <oleg@xxxxxxxxxx> Almost all ->mm == NUL checks in oom_kill.c are wrong. The current code assumes that the task without ->mm has already released its memory and ignores the process. However this is not necessarily true when this process is multithreaded, other live sub-threads can use this ->mm. - Remove the "if (!p->mm)" check in select_bad_process(), it is just wrong. - Add the new helper, find_lock_task_mm(), which finds the live thread which uses the memory and takes task_lock() to pin ->mm - change oom_badness() to use this helper instead of just checking ->mm != NULL. - As David pointed out, select_bad_process() must never choose the task without ->mm, but no matter what oom_badness() returns the task can be chosen if nothing else has been found yet. Change oom_badness() to return int, change it to return -1 if find_lock_task_mm() fails, and change select_bad_process() to check points >= 0. Note! This patch is not enough, we need more changes. - oom_badness() was fixed, but oom_kill_task() still ignores the task without ->mm - oom_forkbomb_penalty() should use find_lock_task_mm() too, and it also needs other changes to actually find the first first-descendant children This will be addressed later. Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx> --- mm/oom_kill.c | 37 +++++++++++++++++++------------------ 1 files changed, 19 insertions(+), 18 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -95,6 +95,20 @@ static void check_panic_on_oom(enum oom_constraint constraint, gfp_t gfp_mask, return false; } +static struct task_struct *find_lock_task_mm(struct task_struct *p) +{ + struct task_struct *t = p; + + do { + task_lock(t); + if (likely(t->mm)) + return t; + task_unlock(t); + } while_each_thread(p, t); + + return NULL; +} + /* * Tasks that fork a very large number of children with seperate address spaces * may be the result of a bug, user error, malicious applications, or even those @@ -164,7 +178,6 @@ static unsigned long oom_forkbomb_penalty(struct task_struct *tsk) */ unsigned int oom_badness(struct task_struct *p, unsigned long totalpages) { - struct mm_struct *mm; int points; /* @@ -181,12 +194,9 @@ unsigned int oom_badness(struct task_struct *p, unsigned long totalpages) if (p->flags & PF_OOM_ORIGIN) return 1000; - task_lock(p); - mm = p->mm; - if (!mm) { - task_unlock(p); + p = find_lock_task_mm(p); + if (!p) return 0; - } /* * The memory controller may have a limit of 0 bytes, so avoid a divide @@ -199,8 +209,8 @@ unsigned int oom_badness(struct task_struct *p, unsigned long totalpages) * The baseline for the badness score is the proportion of RAM that each * task's rss and swap space use. */ - points = (get_mm_rss(mm) + get_mm_counter(mm, MM_SWAPENTS)) * 1000 / - totalpages; + points = (get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS)) * + 1000 / totalpages; task_unlock(p); points += oom_forkbomb_penalty(p); @@ -357,17 +367,8 @@ static struct task_struct *select_bad_process(unsigned int *ppoints, *ppoints = 1000; } - /* - * skip kernel threads and tasks which have already released - * their mm. - */ - if (!p->mm) - continue; - if (p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) - continue; - points = oom_badness(p, totalpages); - if (points > *ppoints || !chosen) { + if (points > *ppoints) { chosen = p; *ppoints = points; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>