Re: [patch 05/18] oom: give current access to memory reserves if it has been killed

David Rientjes <rientjes@xxxxxxxxxx> · Tue, 8 Jun 2010 11:47:17 -0700 (PDT)

On Tue, 8 Jun 2010, KOSAKI Motohiro wrote:

> > It's possible to livelock the page allocator if a thread has mm->mmap_sem
> > and fails to make forward progress because the oom killer selects another
> > thread sharing the same ->mm to kill that cannot exit until the semaphore
> > is dropped.
> > 
> > The oom killer will not kill multiple tasks at the same time; each oom
> > killed task must exit before another task may be killed.  Thus, if one
> > thread is holding mm->mmap_sem and cannot allocate memory, all threads
> > sharing the same ->mm are blocked from exiting as well.  In the oom kill
> > case, that means the thread holding mm->mmap_sem will never free
> > additional memory since it cannot get access to memory reserves and the
> > thread that depends on it with access to memory reserves cannot exit
> > because it cannot acquire the semaphore.  Thus, the page allocator
> > livelocks.
> > 
> > When the oom killer is called and current happens to have a pending
> > SIGKILL, this patch automatically gives it access to memory reserves and
> > returns.  Upon returning to the page allocator, its allocation will
> > hopefully succeed so it can quickly exit and free its memory.  If not, the
> > page allocator will fail the allocation if it is not __GFP_NOFAIL.
> > 
> > Acked-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> > Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> > Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
> > ---
> >  mm/oom_kill.c |   10 ++++++++++
> >  1 files changed, 10 insertions(+), 0 deletions(-)
> > 
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -650,6 +650,16 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
> >  		/* Got some memory back in the last second. */
> >  		return;
> >  
> > +	/*
> > +	 * If current has a pending SIGKILL, then automatically select it.  The
> > +	 * goal is to allow it to allocate so that it may quickly exit and free
> > +	 * its memory.
> > +	 */
> > +	if (fatal_signal_pending(current)) {
> > +		set_thread_flag(TIF_MEMDIE);
> > +		return;
> > +	}
> > +
> >  	if (sysctl_panic_on_oom == 2) {
> >  		dump_header(NULL, gfp_mask, order, NULL);
> >  		panic("out of memory. Compulsory panic_on_oom is selected.\n");
> 
> Sorry, I have found that this patch works incorrectly. I did not pull it.
> 

You're taking back your ack?

Why does this not work?  It's not killing a potentially immune task; the
task is already dying.  We're simply giving it access to memory reserves
so that it can exit quickly.  OOM_DISABLE does not imply that a task cannot
exit on its own or be killed by another application or user.  We simply
don't want to needlessly kill another task when current is already dying
but cannot allocate the memory it needs to exit.
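
As a rough illustration of the intended behavior, here is a minimal
userspace model.  It is not kernel code: struct model_task,
model_out_of_memory() and model_can_allocate() are hypothetical stand-ins
for current, out_of_memory() and the allocator's watermark check, and the
watermark logic is simplified on the assumption that a TIF_MEMDIE task may
allocate below the minimum watermark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these are the kernel's definitions. */
struct model_task {
	bool sigkill_pending;	/* stands in for fatal_signal_pending(task) */
	bool memdie;		/* stands in for the TIF_MEMDIE thread flag */
};

enum oom_action {
	OOM_GRANT_RESERVES,	/* current is already dying: let it allocate */
	OOM_SELECT_VICTIM,	/* fall through to normal victim selection */
};

/* Models the check the patch adds near the top of out_of_memory(). */
enum oom_action model_out_of_memory(struct model_task *current_task)
{
	assert(current_task != NULL);

	if (current_task->sigkill_pending) {
		/* No new victim is chosen; current only gains reserve access. */
		current_task->memdie = true;
		return OOM_GRANT_RESERVES;
	}
	return OOM_SELECT_VICTIM;
}

/*
 * Models the allocator side (simplified assumption): a task marked memdie
 * may allocate as long as any page is free, everyone else must stay above
 * the minimum watermark.
 */
bool model_can_allocate(const struct model_task *t,
			long free_pages, long min_watermark)
{
	assert(t != NULL);

	if (t->memdie)
		return free_pages > 0;
	return free_pages > min_watermark;
}
```

In this model a task with a pending SIGKILL that enters the oom path comes
back with memdie set and can allocate from below the watermark, while a
task without a pending SIGKILL falls through to victim selection and is
still refused.  Nothing here touches OOM_DISABLE, because no task is being
selected for killing.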

Please reconsider.
