Re: [patch v2] mm, oom: do not schedule if current has been killed

David Rientjes <rientjes@xxxxxxxxxx> · Mon, 18 Jun 2012 23:26:47 -0700 (PDT)

On Tue, 19 Jun 2012, KOSAKI Motohiro wrote:

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -746,10 +746,11 @@ out:
> >        read_unlock(&tasklist_lock);
> >
> >        /*
> > -        * Give "p" a good chance of killing itself before we
> > +        * Give "p" a good chance of exiting before we
> >         * retry to allocate memory unless "p" is current
> >         */
> > -       if (killed && !test_thread_flag(TIF_MEMDIE))
> > +       if (killed && !fatal_signal_pending(current) &&
> > +                     !(current->flags & PF_EXITING))
> >                schedule_timeout_uninterruptible(1);
> >  }
> 
> Why not check gfp_flags? I think the rule is,
> 
> 1) a thread newly marked as TIF_MEMDIE
>     -> it can now access reserve memory. let's retry immediately.
> 2) allocation for GFP_HIGHUSER_MOVABLE
>     -> we can fail to allocate it safely. let's immediately fail.
>         (I suspect we need to change page allocator too)
> 3) GFP_KERNEL and PF_EXITING
>     -> don't retry immediately. It will fail again. let's wait until
>         the killed process has exited.
> 

The killed process may exit, but that does not guarantee that its memory
will be freed if it is shared with current.  That is the case this patch
addresses: right now we schedule unnecessarily when current has been
killed or is already on the exit path.  We want to retry as soon as
possible so that either the allocation succeeds or we call the oom killer
again right away.  In the second case current gets TIF_MEMDIE set because
it has a fatal signal pending, so it can exit in a timely way as well.
The point is that if current has a SIGKILL pending or is already exiting
when it returns from the oom killer, it does no good to keep stalling and
delay that memory being freed.
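
To make the difference between the two checks concrete, here is a rough
userspace model of the condition before and after the patch.  It is only a
sketch: struct task_model and both helper functions are hypothetical
stand-ins for fatal_signal_pending(current), current->flags & PF_EXITING
and test_thread_flag(TIF_MEMDIE), not kernel code.

```c
#include <stdbool.h>

/* Hypothetical model of the state of "current" that the checks look at. */
struct task_model {
	bool fatal_signal_pending;	/* models fatal_signal_pending(current) */
	bool exiting;			/* models current->flags & PF_EXITING */
	bool memdie;			/* models test_thread_flag(TIF_MEMDIE) */
};

/* Old check: sleep after a kill unless current already has TIF_MEMDIE. */
static bool should_sleep_old(bool killed, const struct task_model *cur)
{
	return killed && !cur->memdie;
}

/*
 * Patched check: sleep after a kill only if current is neither killed
 * nor already exiting, so a dying current retries immediately.
 */
static bool should_sleep_new(bool killed, const struct task_model *cur)
{
	return killed && !cur->fatal_signal_pending && !cur->exiting;
}
```

In this model, a current that has a fatal signal pending but no TIF_MEMDIE
yet sleeps under the old check and retries immediately under the new one,
which is the case described above.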