Re: [PATCH] mm,oom_reaper: don't call mmput_async() without atomic_inc_not_zero()

Michal Hocko <mhocko@xxxxxxx> · Thu, 2 Jun 2016 08:48:04 +0200

On Wed 01-06-16 15:53:13, Andrew Morton wrote:
> On Sat, 28 May 2016 17:16:05 +0900 Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> 
> > Commit e2fe14564d3316d1 ("oom_reaper: close race with exiting task")
> > reduced frequency of needlessly selecting next OOM victim, but was
> > calling mmput_async() when atomic_inc_not_zero() failed.
> 
> Changelog fail.
> 
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -478,6 +478,7 @@ static bool __oom_reap_task(struct task_struct *tsk)
> >  	mm = p->mm;
> >  	if (!atomic_inc_not_zero(&mm->mm_users)) {
> >  		task_unlock(p);
> > +		mm = NULL;
> >  		goto unlock_oom;
> >  	}
> 
> This looks like a pretty fatal bug.  I assume the result of hitting
> that race will be a kernel crash, yes?

Yes it is a nasty bug. It was (re)introduced by the final touch to the
goto paths. And yes it can cause a crash.

> Is it even possible to hit that race? 

It is, we can have a concurrent mmput followed by mmdrop.

> find_lock_task_mm() takes some
> care to prevent a NULL ->mm.  But I guess a concurrent mmput() doesn't
> require task_lock().  Kinda makes me wonder what's the point in even
> having find_lock_task_mm() if its guarantee on ->mm is useless...

find_lock_task_mm makes sure that the mm stays non-NULL while we hold
the lock. We have to do all the necessary pinning while holding it.
atomic_inc_not_zero will guarantee we are not racing with the finall
mmput.

Does that make more sense now?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>