Re: [PATCH] oom: add pending SIGKILL check for chosen victim

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 22 Apr 2013 21:51:38 +0200
Michal Hocko <mhocko@xxxxxxx> wrote:

> On Mon 22-04-13 19:06:24, Sergey Dyasly wrote:
> > Currently, fatal_signal_pending() check is issued only for task that invoked
> > oom killer. Add the same check for oom killer's chosen victim.
> > 
> > This eliminates regression with killing multithreaded processes which was
> > introduced by commit 6b0c81b3be114a93f79bd4c5639ade5107d77c21
> > (mm, oom: reduce dependency on tasklist_lock). When one of threads
> > was oom-killed, other threads could also become victims of oom killer, which
> > can cause an infinite loop.
> > 
> > There is a race with task->thread_group RCU protected list deletion/iteration:
> > now only a reference to a chosen thread of dying threadgroup is held, so when
> > the thread doesn't have PF_EXITING flag yet and dump_header() is called
> > to print info, it already has SIGKILL and can call do_exit(), which removes
> > the thread from the thread_group list. After printing info, oom killer
> > is doing while_each_thread() on this thread and it still has next reference
> > to some other thread, but no other thread has next reference to this one.
> > This causes the infinite loop with tasklist_lock read held.
> 
> I am not sure I understand the race you are describing here.
> release_task calls __exit_signal with tasklist_lock held for write. And
> we are holding the very same lock for reading around while_each_thread
> in oom_kill_process.

Yes, we are holding tasklist_lock when iterating, but the thread can be deleted
from thread_group list _before_ that. In this case, while_each_thread loop exit
condition will never be true.

Imagine the following situation:
Threadgroup with 4 threads: thread_1, thread_2, thread_3, thread_4.

thread_1 is oom killed and SIGKILL is sent to all threads.

allocation --> no memory --> invoke oom killer
oom killer selects thread_2 as victim:


           OOM killer               |              thread_2
                                    |
  oom_kill_process(thread_2)        |
      thread_2 has PF_EXITING? no   |      (but has pending SIGKILL)
      dump_header()                 |
                                    |
                                    |      do_exit()
                                    |          sets PF_EXITING
                                    |          list_del_rcu(thread_group)
                                    |
      read_lock(tasklist_lock)      |
      while_each_thread()           |

Iteration order: thread_2 --> thread_3 --> thread_4 --> thread_3 --> thread_4...
This will never reach thread_2 again and break loop, as result: infinite loop.

-- 
dserrg <dserrg@xxxxxxxxx>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]