Re: [patch] oom: prevent unnecessary oom kills or kernel panics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 3 Mar 2011, KOSAKI Motohiro wrote:

> > This patch revents unnecessary oom kills or kernel panics by reverting
> > two commits:
> > 
> > 	495789a5 (oom: make oom_score to per-process value)
> > 	cef1d352 (oom: multi threaded process coredump don't make deadlock)
> > 
> > First, 495789a5 (oom: make oom_score to per-process value) ignores the
> > fact that all threads in a thread group do not necessarily exit at the
> > same time.
> > 
> > It is imperative that select_bad_process() detect threads that are in the
> > exit path, specifically those with PF_EXITING set, to prevent needlessly
> > killing additional tasks.  
> 
> to prevent? No, it is not a reason of PF_EXITING exist.
> 

It is not the sole reason PF_EXITING exists in the kernel, no.  It was 
used in select_bad_process() to ensure we don't needlessly kill another 
task if an eligible one is already in the exit path.  We want to ensure 
that the oom killer only kills a process getting work done when nothing 
has the potential to free memory in the short term.  It's not a guarantee 
that the PF_EXITING task will free memory, but it has the potential to be 
the last thread pinning the ->mm.

> > By iterating over threads instead, it is possible to detect threads that
> > are exiting and nominate them for oom kill so they get access to memory
> > reserves.
> 
> In fact, PF_EXITING is a sing of *THREAD* exiting, not process. Therefore
> PF_EXITING is not a sign of memory freeing in nearly future. If other
> CPUs don't try to free memory, prevent oom and waiting makes deadlock.
> 

It's not a deadlock if a thread is PF_EXITING and isn't stalled by, for 
instance, failed memory allocations.  That's why this patch restores the 
behavior back to what it was previous to cef1d352: if an eligible thread 
is PF_EXITING and is not current, then wait for it; otherwise, if it is 
current, give it access to memory reserves so it can allow the allocation 
to succeed.

> > Second, cef1d352 (oom: multi threaded process coredump don't make
> > deadlock) erroneously avoids making the oom killer a no-op when an
> > eligible thread other than current isfound to be exiting.  We want to
> > detect this situation so that we may allow that exiting thread time to
> > exit and free its memory; if it is able to exit on its own, that should
> > free memory so current is no loner oom.  If it is not able to exit on its
> > own, the oom killer will nominate it for oom kill which, in this case,
> > only means it will get access to memory reserves.
> > 
> > Without this change, it is easy for the oom killer to unnecessarily
> > target tasks when all threads of a victim don't exit before the thread
> > group leader or, in the worst case, panic the machine.
> > 
> 
> You missed deadlock is more worse than panic. And again, task overkill
> is a part of OOM killer design. it is necessary to avoid deadlock. If
> you want to change this spec, you need to remove deadlock change at first.
> 

There is no deadlock being introduced by this patch; if you have an 
example of one, then please show it.  The problem is not just overkill but 
rather panicking the machine when no other eligible processes exist.  We 
have seen this in production quite a few times and we'd like to see this 
patch merged to avoid our machines panicking because the oom killer, by 
your patch, isn't considering threads that are eligible in the exit path 
once their parent has been killed and has exited itself yet memory freeing 
isn't possible yet because the threads still pin the ->mm.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]