Michal Hocko wrote:
> On Sat 25-06-16 01:19:12, Tetsuo Handa wrote:
> > Michal Hocko wrote:
> [...]
> > > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > > index 4c21f744daa6..97be9324a58b 100644
> > > --- a/mm/oom_kill.c
> > > +++ b/mm/oom_kill.c
> > > @@ -671,6 +671,22 @@ void mark_oom_victim(struct task_struct *tsk)
> > >          /* OOM killer might race with memcg OOM */
> > >          if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE))
> > >                  return;
> > > +#ifndef CONFIG_MMU
> > > +        /*
> > > +         * we shouldn't risk setting TIF_MEMDIE on a task which has passed its
> > > +         * exit_mm task->mm = NULL and exit_oom_victim otherwise it could
> > > +         * theoretically keep its TIF_MEMDIE for ever while waiting for a parent
> > > +         * to get it out of zombie state. MMU doesn't have this problem because
> > > +         * it has the oom_reaper to clear the flag asynchronously.
> > > +         */
> > > +        task_lock(tsk);
> > > +        if (!tsk->mm) {
> > > +                clear_tsk_thread_flag(tsk, TIF_MEMDIE);
> > > +                task_unlock(tsk);
> > > +                return;
> > > +        }
> > > +        task_unlock(tsk);
> >
> > This makes mark_oom_victim(tsk) for tsk->mm == NULL a no-op unless tsk is
> > currently doing memory allocation. And it is possible that tsk is blocked
> > waiting for somebody else's memory allocation after returning from
> > exit_mm() from do_exit(), isn't it? Then, how is this better than current
> > code (i.e. sets TIF_MEMDIE to a mm-less thread group leader)?
>
> Well, the whole point of the check is to not set the flag after we
> could have passed exit_mm->exit_oom_victim and keep it for the rest of
> (unbounded) victim life as there is nothing else to do so.

OK. Based on commit 3da88fb3bacfaa33 ("mm, oom: move GFP_NOFS check to
out_of_memory") and an assumption that any OOM-killed thread shall eventually
win the mutex_trylock(&oom_lock) competition in __alloc_pages_may_oom() no
matter how disturbing factors (e.g.
scheduling priority) delay OOM-killed threads, you prefer asking each
OOM-killed thread to get TIF_MEMDIE via the
"if (current->mm && task_will_free_mem(current))" shortcut in out_of_memory(),
while keeping the "if (task_will_free_mem(p))" shortcut in oom_kill_process()
a no-op. Yes, it should be harmless.

But I prefer not to wait for each OOM-killed thread to win the
mutex_trylock(&oom_lock) competition in __alloc_pages_may_oom(). Setting
TIF_MEMDIE at the "if (task_will_free_mem(p))" shortcut in oom_kill_process()
can save a thread which got TIF_MEMDIE from participating in the
mutex_trylock(&oom_lock) competition, which is otherwise needed for reaching
the "if (current->mm && task_will_free_mem(current))" shortcut in
out_of_memory().

> If the tsk is waiting for something then we are screwed same way we were
> before. Or have I missed your point?
> --
> Michal Hocko
> SUSE Labs
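
For anyone following the thread, the effect of the quoted !CONFIG_MMU hunk can
be modelled in userspace. This is only a rough sketch under stated
assumptions: struct model_task, model_test_and_set_memdie() and
model_mark_oom_victim_nommu() are hypothetical stand-ins, not kernel code, and
task_lock()/task_unlock() are left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct model_mm {
	int unused;
};

struct model_task {
	struct model_mm *mm;	/* NULL once the task has passed exit_mm() */
	bool tif_memdie;	/* models the TIF_MEMDIE thread flag */
};

/* Models test_and_set_tsk_thread_flag(): set the flag, return its old value. */
static bool model_test_and_set_memdie(struct model_task *tsk)
{
	bool old = tsk->tif_memdie;

	tsk->tif_memdie = true;
	return old;
}

/*
 * Models the !CONFIG_MMU path of the quoted hunk.
 * Returns true only if this call left tsk newly marked as an OOM victim.
 */
static bool model_mark_oom_victim_nommu(struct model_task *tsk)
{
	assert(tsk != NULL);

	/* Somebody else already marked the task: nothing to do. */
	if (model_test_and_set_memdie(tsk))
		return false;

	/* Task is already mm-less: back the flag out again. */
	if (!tsk->mm) {
		tsk->tif_memdie = false;
		return false;
	}
	return true;
}
```

In this model a task with mm == NULL never keeps the flag, which is the
"no-op" behaviour questioned above; whether that task can still obtain
TIF_MEMDIE through another path is outside what the sketch covers.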