On Mon, 14 Mar 2011, Oleg Nesterov wrote:

> select_bad_process() assumes that a TIF_MEMDIE process should go away.
> But it can only go away it its parent does wait(). Change this check to
> ignore the TIF_MEMDIE zombies.
>

The equivalent of this change would be to set TIF_MEMDIE for all threads
in a thread group when choosing a process to kill; as we've already
discussed in your first series of patches, that has the risk of fully
depleting memory reserves and causing the kernel to deadlock.  We want to
limit TIF_MEMDIE to an oom killed task, or to current when it is
responding to a SIGKILL or is already in the exit path, because we know
it's exiting and without memory reserves it may never exit.

This patch is even more concerning, however, because select_bad_process()
isn't even guaranteed to select a thread from the same thread group this
time.

> Note: this is _not_ enough. Just a minimal fix.
>
> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> ---
>
>  mm/oom_kill.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> --- 38/mm/oom_kill.c~2_tif_memdie_zombie	2011-03-14 18:51:49.000000000 +0100
> +++ 38/mm/oom_kill.c	2011-03-14 18:52:39.000000000 +0100
> @@ -311,7 +311,8 @@ static struct task_struct *select_bad_pr
>  		 * blocked waiting for another task which itself is waiting
>  		 * for memory. Is there a better alternative?
>  		 */
> -		if (test_tsk_thread_flag(p, TIF_MEMDIE))
> +		if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
> +		    !p->exit_state && thread_group_empty(p))
>  			return ERR_PTR(-1UL);
>
>  		/*
>
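
To make the objection concrete, here is a rough userspace model of the
check before and after the quoted hunk. It is only a sketch, not kernel
code: struct model_task and both helper functions are hypothetical names
invented for illustration, and the three booleans stand in for
test_tsk_thread_flag(p, TIF_MEMDIE), p->exit_state and
thread_group_empty(p). "Aborts the scan" models select_bad_process()
returning ERR_PTR(-1UL), i.e. declining to pick another victim.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task_struct state the check looks at. */
struct model_task {
    bool tif_memdie;          /* models test_tsk_thread_flag(p, TIF_MEMDIE) */
    int  exit_state;          /* nonzero once the task is a zombie or dead  */
    bool thread_group_empty;  /* models thread_group_empty(p)               */
};

/* Before the patch: any TIF_MEMDIE task aborts the scan. */
static bool aborts_scan_old(const struct model_task *p)
{
    return p->tif_memdie;
}

/* After the patch: only a live, single-threaded TIF_MEMDIE task does. */
static bool aborts_scan_new(const struct model_task *p)
{
    return p->tif_memdie && !p->exit_state && p->thread_group_empty;
}

/* Walks the three interesting cases and asserts how each check behaves. */
static void check_model_examples(void)
{
    /* A TIF_MEMDIE zombie: the case the patch is aimed at. */
    struct model_task zombie = { true, 1, true };
    /* A live TIF_MEMDIE thread that still has sibling threads. */
    struct model_task live_mt = { true, 0, false };
    /* A live, single-threaded TIF_MEMDIE task. */
    struct model_task live_st = { true, 0, true };

    assert(aborts_scan_old(&zombie)  && !aborts_scan_new(&zombie));
    assert(aborts_scan_old(&live_mt) && !aborts_scan_new(&live_mt));
    assert(aborts_scan_old(&live_st) &&  aborts_scan_new(&live_st));
}
```

The second case is the worrying one in this model: a live TIF_MEMDIE
thread with siblings no longer stops the scan, so another task, possibly
in a different thread group, can be chosen and given access to memory
reserves as well.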