Re: [PATCH 6/6] mm, oom: fortify task_will_free_mem

Michal Hocko wrote:
> On Wed 01-06-16 00:03:53, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > task_will_free_mem is rather weak. It doesn't really tell whether
> > > the task has a chance to drop its mm. 98748bd72200 ("oom: consider
> > > multi-threaded tasks in task_will_free_mem") made a first step
> > > towards making it more robust for multi-threaded applications, so now we
> > > know that the whole process is going down and will probably drop the mm.
> > > 
> > > This patch builds on top for more complex scenarios where mm is shared
> > > between different processes - CLONE_VM without CLONE_THREAD resp
> > > CLONE_SIGHAND, or in kernel use_mm().
> > > 
> > > Make sure that all processes sharing the mm are killed or exiting. This
> > > will allow us to replace try_oom_reaper by wake_oom_reaper. Therefore
> > > all paths which bypass the oom killer are now reapable and so they
> > > shouldn't lock up the oom killer.
> > 
> > Really? The can_oom_reap variable was not removed before this patch.
> > It means that oom_kill_process() might fail to call wake_oom_reaper()
> > while setting TIF_MEMDIE to one of threads using that mm_struct.
> > If use_mm() or global init keeps that mm_struct not OOM reapable, other
> > threads sharing that mm_struct will get task_will_free_mem() == false,
> > won't it?
> > 
> > How is it guaranteed that task_will_free_mem() == false && oom_victims > 0
> > shall not lock up the OOM killer?
> 
> But this patch is talking about task_will_free_mem == true. Is the
> description confusing? Should I reword the changelog?

The situation I'm talking about is

  (1) out_of_memory() is called.
  (2) select_bad_process() is called because task_will_free_mem(current) == false.
  (3) oom_kill_process() is called because select_bad_process() chose a victim.
  (4) oom_kill_process() sets TIF_MEMDIE on that victim.
  (5) oom_kill_process() fails to call wake_oom_reaper() because that victim's
      memory was shared by use_mm() or global init.
  (6) other !TIF_MEMDIE threads sharing that victim's memory call out_of_memory().
  (7) select_bad_process() is called because task_will_free_mem(current) == false.
  (8) oom_scan_process_thread() returns OOM_SCAN_ABORT because it finds TIF_MEMDIE
      set at (4).
  (9) other !TIF_MEMDIE threads sharing that victim's memory fail to get TIF_MEMDIE.
  (10) How will the other !TIF_MEMDIE threads sharing that victim's memory
       release that memory?
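
To make the sequence above easier to follow, here is a toy userspace model of
it. This is only a sketch of the behaviour as I describe it in this mail, not
code from mm/oom_kill.c. Every toy_* name is hypothetical, the victim is
simply assumed to be tasks[0], and the "reapable" flag stands in for "the mm
is not shared via use_mm() or with global init".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these types or functions exist in the kernel. */
struct toy_task {
	bool memdie;        /* stands in for TIF_MEMDIE */
	bool will_free_mem; /* stands in for task_will_free_mem(task) */
};

struct toy_mm {
	bool reapable;      /* false: shared via use_mm() or with global init */
	bool reaper_woken;  /* stands in for wake_oom_reaper() having run */
};

enum toy_result { TOY_SHORTCUT, TOY_KILLED, TOY_ABORTED };

/* Models one out_of_memory() call made by tasks[cur]; all tasks share mm. */
static enum toy_result toy_out_of_memory(struct toy_task *tasks, int n,
					 int cur, struct toy_mm *mm)
{
	assert(n > 0 && cur >= 0 && cur < n);

	/* Shortcut path: the caller itself is expected to free memory. */
	if (tasks[cur].will_free_mem) {
		tasks[cur].memdie = true;
		return TOY_SHORTCUT;
	}

	/* Steps (7)-(8): the scan aborts once any TIF_MEMDIE task is seen. */
	for (int i = 0; i < n; i++)
		if (tasks[i].memdie)
			return TOY_ABORTED;

	/* Steps (3)-(4): assume tasks[0] is chosen and marked as the victim. */
	tasks[0].memdie = true;

	/* Step (5): the reaper is woken only when the mm is reapable. */
	if (mm->reapable)
		mm->reaper_woken = true;

	return TOY_KILLED;
}
```

With three tasks that all have will_free_mem == false and an mm with
reapable == false, the first call marks tasks[0] and leaves reaper_woken
false. Every later call from tasks[1] or tasks[2] returns TOY_ABORTED, so
neither of them ever gets the memdie flag in this model.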

I'm fine with the task_will_free_mem(current) == true case. My question is:
doesn't this patch break the task_will_free_mem(current) == false case when
there is already a TIF_MEMDIE thread?
