Re: [patch] oom: prevent unnecessary oom kills or kernel panics

KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> · Sun, 6 Mar 2011 20:14:31 +0900 (JST)

> On Thu, 3 Mar 2011, KOSAKI Motohiro wrote:
> 
> > > This patch revents unnecessary oom kills or kernel panics by reverting
> > > two commits:
> > > 
> > > 	495789a5 (oom: make oom_score to per-process value)
> > > 	cef1d352 (oom: multi threaded process coredump don't make deadlock)
> > > 
> > > First, 495789a5 (oom: make oom_score to per-process value) ignores the
> > > fact that all threads in a thread group do not necessarily exit at the
> > > same time.
> > > 
> > > It is imperative that select_bad_process() detect threads that are in the
> > > exit path, specifically those with PF_EXITING set, to prevent needlessly
> > > killing additional tasks.  
> > 
> > to prevent? No, it is not a reason of PF_EXITING exist.
> > 
> 
> It is not the sole reason PF_EXITING exists in the kernel, no.  It was 
> used in select_bad_process() to ensure we don't needlessly kill another 
> task if an eligible one is already in the exit path.  We want to ensure 
> that the oom killer only kills a process getting work done when nothing 
> has the potential to free memory in the short term.  It's not a guarantee 
> that the PF_EXITING task will free memory, but it has the potential to be 
> the last thread pinning the ->mm.
> 
> > > By iterating over threads instead, it is possible to detect threads that
> > > are exiting and nominate them for oom kill so they get access to memory
> > > reserves.
> > 
> > In fact, PF_EXITING is a sing of *THREAD* exiting, not process. Therefore
> > PF_EXITING is not a sign of memory freeing in nearly future. If other
> > CPUs don't try to free memory, prevent oom and waiting makes deadlock.
> > 
> 
> It's not a deadlock if a thread is PF_EXITING and isn't stalled by, for 
> instance, failed memory allocations.  That's why this patch restores the 
> behavior back to what it was previous to cef1d352: if an eligible thread 
> is PF_EXITING and is not current, then wait for it; otherwise, if it is 
> current, give it access to memory reserves so it can allow the allocation 
> to succeed.
> 
> > > Second, cef1d352 (oom: multi threaded process coredump don't make
> > > deadlock) erroneously avoids making the oom killer a no-op when an
> > > eligible thread other than current isfound to be exiting.  We want to
> > > detect this situation so that we may allow that exiting thread time to
> > > exit and free its memory; if it is able to exit on its own, that should
> > > free memory so current is no loner oom.  If it is not able to exit on its
> > > own, the oom killer will nominate it for oom kill which, in this case,
> > > only means it will get access to memory reserves.
> > > 
> > > Without this change, it is easy for the oom killer to unnecessarily
> > > target tasks when all threads of a victim don't exit before the thread
> > > group leader or, in the worst case, panic the machine.
> > > 
> > 
> > You missed deadlock is more worse than panic. And again, task overkill
> > is a part of OOM killer design. it is necessary to avoid deadlock. If
> > you want to change this spec, you need to remove deadlock change at first.
> > 
> 
> There is no deadlock being introduced by this patch; if you have an 
> example of one, then please show it.  The problem is not just overkill but 
> rather panicking the machine when no other eligible processes exist.  We 
> have seen this in production quite a few times and we'd like to see this 
> patch merged to avoid our machines panicking because the oom killer, by 
> your patch, isn't considering threads that are eligible in the exit path 
> once their parent has been killed and has exited itself yet memory freeing 
> isn't possible yet because the threads still pin the ->mm.

No. While you don't understand current code, I'll not taking yours.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>