Michal Hocko wrote:
> On Wed 01-06-16 00:03:53, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> > > task_will_free_mem is rather weak. It doesn't really tell whether
> > > the task has chance to drop its mm. 98748bd72200 ("oom: consider
> > > multi-threaded tasks in task_will_free_mem") made a first step
> > > into making it more robust for multi-threaded applications so now we
> > > know that the whole process is going down and probably drop the mm.
> > >
> > > This patch builds on top for more complex scenarios where mm is shared
> > > between different processes - CLONE_VM without CLONE_THREAD resp
> > > CLONE_SIGHAND, or in kernel use_mm().
> > >
> > > Make sure that all processes sharing the mm are killed or exiting. This
> > > will allow us to replace try_oom_reaper by wake_oom_reaper. Therefore
> > > all paths which bypass the oom killer are now reapable and so they
> > > shouldn't lock up the oom killer.
> >
> > Really? The can_oom_reap variable was not removed before this patch.
> > It means that oom_kill_process() might fail to call wake_oom_reaper()
> > while setting TIF_MEMDIE to one of threads using that mm_struct.
> > If use_mm() or global init keeps that mm_struct not OOM reapable, other
> > threads sharing that mm_struct will get task_will_free_mem() == false,
> > won't it?
> >
> > How is it guaranteed that task_will_free_mem() == false && oom_victims > 0
> > shall not lock up the OOM killer?
>
> But this patch is talking about task_will_free_mem == true. Is the
> description confusing? Should I reword the changelog?

The situation I'm talking about is

  (1) out_of_memory() is called.
  (2) select_bad_process() is called because task_will_free_mem(current) == false.
  (3) oom_kill_process() is called because select_bad_process() chose a victim.
  (4) oom_kill_process() sets TIF_MEMDIE on that victim.
  (5) oom_kill_process() fails to call wake_oom_reaper() because that victim's
      memory was shared by use_mm() or global init.
  (6) other !TIF_MEMDIE threads sharing that victim's memory call out_of_memory().
  (7) select_bad_process() is called because task_will_free_mem(current) == false.
  (8) oom_scan_process_thread() returns OOM_SCAN_ABORT because it finds
      TIF_MEMDIE set at (4).
  (9) other !TIF_MEMDIE threads sharing that victim's memory fail to get TIF_MEMDIE.
  (10) How will the other !TIF_MEMDIE threads sharing that victim's memory
       release that memory?

I'm fine with the task_will_free_mem(current) == true case. My question is:
doesn't this patch break the task_will_free_mem(current) == false case when
there is already a TIF_MEMDIE thread?
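
To make the sequence (1)-(10) concrete, here is a toy user-space model. It is
only a sketch and not kernel code: every model_* name is hypothetical, the
victim selection is deliberately simplistic, and it assumes that an mm pinned
by use_mm() or shared with global init is never handed to the reaper and that
the sharing threads see task_will_free_mem() == false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy user-space model of steps (1)-(10); NOT kernel code.
 * Every model_* name is hypothetical and exists only for this sketch.
 */

struct model_mm {
	bool reapable;	/* assumption: false when use_mm() or global init shares the mm */
	bool reaped;	/* set when the (modelled) OOM reaper would have run */
};

struct model_task {
	struct model_mm *mm;
	bool memdie;		/* stands in for TIF_MEMDIE */
	bool will_free_mem;	/* stands in for task_will_free_mem(task) */
};

/* Steps (4)-(5): mark the victim, wake the reaper only if the mm is reapable. */
static void model_kill(struct model_task *victim)
{
	victim->memdie = true;
	if (victim->mm->reapable)
		victim->mm->reaped = true;
}

/* Step (8): the scan aborts as soon as any task already has TIF_MEMDIE. */
static bool model_scan_aborts(const struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].memdie)
			return true;
	return false;
}

/* Returns true if a victim was marked, false if the scan aborted. */
static bool model_out_of_memory(struct model_task *tasks, size_t n, size_t cur)
{
	assert(cur < n);
	if (tasks[cur].will_free_mem) {		/* fast path, not the case in question */
		model_kill(&tasks[cur]);
		return true;
	}
	if (model_scan_aborts(tasks, n))	/* steps (7)-(9) */
		return false;
	model_kill(&tasks[0]);			/* steps (3)-(5): simplistic victim choice */
	return true;
}

/* Runs the scenario; returns true if the shared mm is never reaped. */
static bool model_demo_stuck(void)
{
	struct model_mm mm = { .reapable = false, .reaped = false };
	struct model_task t[2] = {
		{ .mm = &mm, .memdie = false, .will_free_mem = false },
		{ .mm = &mm, .memdie = false, .will_free_mem = false },
	};

	model_out_of_memory(t, 2, 0);			/* steps (1)-(5) */
	bool second = model_out_of_memory(t, 2, 1);	/* steps (6)-(9) */

	return !second && !t[1].memdie && !mm.reaped;	/* step (10) */
}
```

In this model model_demo_stuck() returns true: the second caller aborts its
scan, gets no TIF_MEMDIE, and nothing reaps the shared mm. The model leaves out
the victim exiting on its own, so it only restates the question and does not
show that the real kernel locks up.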