On Wed 01-06-16 15:53:13, Andrew Morton wrote: > On Sat, 28 May 2016 17:16:05 +0900 Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > > > Commit e2fe14564d3316d1 ("oom_reaper: close race with exiting task") > > reduced frequency of needlessly selecting next OOM victim, but was > > calling mmput_async() when atomic_inc_not_zero() failed. > > Changelog fail. > > > --- a/mm/oom_kill.c > > +++ b/mm/oom_kill.c > > @@ -478,6 +478,7 @@ static bool __oom_reap_task(struct task_struct *tsk) > > mm = p->mm; > > if (!atomic_inc_not_zero(&mm->mm_users)) { > > task_unlock(p); > > + mm = NULL; > > goto unlock_oom; > > } > > This looks like a pretty fatal bug. I assume the result of hitting > that race will be a kernel crash, yes? Yes it is a nasty bug. It was (re)introduced by the final touch to the goto paths. And yes it can cause a crash. > Is it even possible to hit that race? It is, we can have a concurrent mmput followed by mmdrop. > find_lock_task_mm() takes some > care to prevent a NULL ->mm. But I guess a concurrent mmput() doesn't > require task_lock(). Kinda makes me wonder what's the point in even > having find_lock_task_mm() if its guarantee on ->mm is useless... find_lock_task_mm makes sure that the mm stays non-NULL while we hold the lock. We have to do all the necessary pinning while holding it. atomic_inc_not_zero will guarantee we are not racing with the finall mmput. Does that make more sense now? -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>