Hi, Balbir.

On Mon, Jun 7, 2010 at 9:58 PM, Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> * David Rientjes <rientjes@xxxxxxxxxx> [2010-06-06 15:34:03]:
>
>> From: Oleg Nesterov <oleg@xxxxxxxxxx>
>>
>> Almost all ->mm == NUL checks in oom_kill.c are wrong.
>
> typo should be NULL
>
>>
>> The current code assumes that the task without ->mm has already
>> released its memory and ignores the process. However this is not
>> necessarily true when this process is multithreaded, other live
>> sub-threads can use this ->mm.
>>
>> - Remove the "if (!p->mm)" check in select_bad_process(), it is
>>   just wrong.
>>
>> - Add the new helper, find_lock_task_mm(), which finds the live
>>   thread which uses the memory and takes task_lock() to pin ->mm
>>
>> - change oom_badness() to use this helper instead of just checking
>>   ->mm != NULL.
>>
>> - As David pointed out, select_bad_process() must never choose the
>>   task without ->mm, but no matter what oom_badness() returns the
>>   task can be chosen if nothing else has been found yet.
>>
>>   Change oom_badness() to return int, change it to return -1 if
>>   find_lock_task_mm() fails, and change select_bad_process() to
>>   check points >= 0.
>>
>> Note! This patch is not enough, we need more changes.
>>
>>      - oom_badness() was fixed, but oom_kill_task() still ignores
>>        the task without ->mm
>>
>>      - oom_forkbomb_penalty() should use find_lock_task_mm() too,
>>        and it also needs other changes to actually find the first
>>        first-descendant children
>>
>> This will be addressed later.
>>
>> [kosaki.motohiro@xxxxxxxxxxxxxx: use in badness(), __oom_kill_task()]
>> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
>> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
>> ---
>>  mm/oom_kill.c |   74 +++++++++++++++++++++++++++++++++------------------------
>>  1 files changed, 43 insertions(+), 31 deletions(-)
>>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -52,6 +52,20 @@ static int has_intersects_mems_allowed(struct task_struct *tsk)
>>       return 0;
>>  }
>>
>> +static struct task_struct *find_lock_task_mm(struct task_struct *p)
>> +{
>> +     struct task_struct *t = p;
>> +
>> +     do {
>> +             task_lock(t);
>> +             if (likely(t->mm))
>> +                     return t;
>> +             task_unlock(t);
>> +     } while_each_thread(p, t);
>> +
>> +     return NULL;
>> +}
>> +
>
> Even if we miss this mm via p->mm, won't for_each_process actually
> catch it? Are you suggesting that the main thread could have detached
> the mm and a thread might still have it mapped?

Yes. Even after the main thread has detached its mm, a sub-thread may
still have that mm.
Since the name confused you, I think this function name isn't good.
So I suggested the following:

http://lkml.org/lkml/2010/6/2/325

Anyway, it does make sense to me.

--
Kind regards,
Minchan Kim
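
P.S. To make the case in question concrete, here is a minimal userspace
sketch of the thread-group walk. It is not kernel code: struct task,
struct mm, find_task_with_mm() and the two demo functions are
hypothetical stand-ins invented for illustration, the thread group is
modelled as a plain circular list, and the task_lock()/task_unlock()
calls of the real helper are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's definitions. */
struct mm {
	int id;
};

struct task {
	struct mm *mm;		/* NULL once this thread has detached its mm */
	struct task *next;	/* circular list of threads in the same group */
};

/*
 * Walk the thread group starting at p and return the first thread that
 * still has an mm, or NULL if every thread has detached it.
 * Assumption: locking is omitted; the helper in the patch takes
 * task_lock() on each thread and returns with the lock held.
 */
static struct task *find_task_with_mm(struct task *p)
{
	struct task *t = p;

	assert(p != NULL);
	do {
		if (t->mm)
			return t;
		t = t->next;
	} while (t != p);

	return NULL;
}

/* Leader has dropped ->mm, but a sub-thread still uses it. */
static int demo_leader_detached(void)
{
	struct mm shared = { 1 };
	struct task leader, sub;

	leader.mm = NULL;	/* a bare "if (!p->mm)" check would skip this group */
	leader.next = &sub;
	sub.mm = &shared;	/* the memory is still in use */
	sub.next = &leader;

	return find_task_with_mm(&leader) == &sub;
}

/* Every thread has dropped ->mm, so nothing is found. */
static int demo_all_detached(void)
{
	struct task leader, sub;

	leader.mm = NULL;
	leader.next = &sub;
	sub.mm = NULL;
	sub.next = &leader;

	return find_task_with_mm(&leader) == NULL;
}
```

In the first demo a check of only the leader's ->mm would treat the
whole process as having released its memory, while the walk still finds
the sub-thread. That is the situation the helper is meant to cover.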