On Tue, 8 Mar 2011, Oleg Nesterov wrote:

> > > By iterating over threads instead, it is possible to detect threads that
> > > are exiting and nominate them for oom kill so they get access to memory
> > > reserves.
> >
> > In fact, PF_EXITING is a sign of *THREAD* exiting, not process. Therefore
> > PF_EXITING is not a sign of memory freeing in nearly future. If other
> > CPUs don't try to free memory, prevent oom and waiting makes deadlock.
>
> I agree. I don't understand this patch.
>

Using for_each_process() does not consider threads that have failed to
exit after the oom killed parent and, thus, we select another innocent
task to kill when we're really just waiting for those threads to exit (and
perhaps they need memory reserves in the exit path) or, in the worst case,
panic if there is nothing else eligible.

The end result is that without this patch, we sometimes unnecessarily
panic (and "sometimes" is defined as "many machines" for us) when nothing
else is eligible for kill within an oom cpuset, yet doing a
do_each_thread() over that cpuset shows threads of a previously oom killed
parent that have yet to exit.

> > > @@ -324,7 +324,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
> > >  		 * the process of exiting and releasing its resources.
> > >  		 * Otherwise we could get an easy OOM deadlock.
> > >  		 */
> > > -		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
> > > +		if ((p->flags & PF_EXITING) && p->mm) {
> >
> > The previous check was not perfect, we know this.
>
> But with this patch applied, the simple program below disables oom-killer
> completely. select_bad_process() can never succeed.
>

The program illustrates a problem that shouldn't be fixed in
select_bad_process() but rather in oom_kill_process() when choosing an
eligible child of the selected task to kill in place of its parent.
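
For illustration only, the difference between the two iteration strategies
can be modelled in userspace C. This is a hypothetical sketch, not kernel
code: model_thread, model_process, MODEL_PF_EXITING and both scan functions
are invented names, and it assumes the group leader has already dropped its
mm while a sibling thread is still on its way out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's definitions. */
#define MODEL_PF_EXITING 0x4u

struct model_thread {
    unsigned int flags; /* may contain MODEL_PF_EXITING */
    bool has_mm;        /* stands in for a non-NULL p->mm */
};

struct model_process {
    const struct model_thread *threads; /* threads[0] is the group leader */
    size_t nr_threads;
};

/* The check from the patched hunk, applied to one modelled thread. */
static bool model_is_exiting_with_mm(const struct model_thread *t)
{
    return (t->flags & MODEL_PF_EXITING) && t->has_mm;
}

/* Models a for_each_process() style scan: only group leaders are visited. */
static bool leader_scan_finds_exiting(const struct model_process *procs,
                                      size_t nr_procs)
{
    for (size_t i = 0; i < nr_procs; i++) {
        if (procs[i].nr_threads == 0)
            continue;
        if (model_is_exiting_with_mm(&procs[i].threads[0]))
            return true;
    }
    return false;
}

/* Models a do_each_thread() style scan: every thread is visited. */
static bool thread_scan_finds_exiting(const struct model_process *procs,
                                      size_t nr_procs)
{
    for (size_t i = 0; i < nr_procs; i++) {
        for (size_t j = 0; j < procs[i].nr_threads; j++) {
            if (model_is_exiting_with_mm(&procs[i].threads[j]))
                return true;
        }
    }
    return false;
}
```

In this model, a process whose leader has no mm but whose second thread is
still exiting with an mm is missed by the leader-only scan and found by the
per-thread scan, which is the case described above.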