On Mon, 22 Apr 2013 21:51:38 +0200 Michal Hocko <mhocko@xxxxxxx> wrote: > On Mon 22-04-13 19:06:24, Sergey Dyasly wrote: > > Currently, fatal_signal_pending() check is issued only for task that invoked > > oom killer. Add the same check for oom killer's chosen victim. > > > > This eliminates regression with killing multithreaded processes which was > > introduced by commit 6b0c81b3be114a93f79bd4c5639ade5107d77c21 > > (mm, oom: reduce dependency on tasklist_lock). When one of threads > > was oom-killed, other threads could also become victims of oom killer, which > > can cause an infinite loop. > > > > There is a race with task->thread_group RCU protected list deletion/iteration: > > now only a reference to a chosen thread of dying threadgroup is held, so when > > the thread doesn't have PF_EXITING flag yet and dump_header() is called > > to print info, it already has SIGKILL and can call do_exit(), which removes > > the thread from the thread_group list. After printing info, oom killer > > is doing while_each_thread() on this thread and it still has next reference > > to some other thread, but no other thread has next reference to this one. > > This causes the infinite loop with tasklist_lock read held. > > I am not sure I understand the race you are describing here. > release_task calls __exit_signal with tasklist_lock held for write. And > we are holding the very same lock for reading around while_each_thread > in oom_kill_process. Yes, we are holding tasklist_lock when iterating, but the thread can be deleted from thread_group list _before_ that. In this case, while_each_thread loop exit condition will never be true. Imagine the following situation: Threadgroup with 4 threads: thread_1, thread_2, thread_3, thread_4. thread_1 is oom killed and SIGKILL is sent to all threads. allocation --> no memory --> invoke oom killer oom killer selects thread_2 as victim: OOM killer | thread_2 | oom_kill_process(thread_2) | thread_2 has PF_EXITING? no | (but has pending SIGKILL) dump_header() | | | do_exit() | sets PF_EXITING | list_del_rcu(thread_group) | read_lock(tasklist_lock) | while_each_thread() | Iteration order: thread_2 --> thread_3 --> thread_4 --> thread_3 --> thread_4... This will never reach thread_2 again and break loop, as result: infinite loop. -- dserrg <dserrg@xxxxxxxxx> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>