Re: [patch] oom: prevent unnecessary oom kills or kernel panics

On Tue, 8 Mar 2011, Oleg Nesterov wrote:

> > > By iterating over threads instead, it is possible to detect threads that
> > > are exiting and nominate them for oom kill so they get access to memory
> > > reserves.
> >
> > In fact, PF_EXITING is a sign of a *THREAD* exiting, not a process.
> > Therefore PF_EXITING is not a sign that memory will be freed in the
> > near future. If other CPUs don't try to free memory, preventing the
> > oom kill and waiting causes a deadlock.
> 
> I agree. I don't understand this patch.
> 

Using for_each_process() does not consider threads that have failed to 
exit after their parent was oom killed. As a result, we select another 
innocent task to kill when we're really just waiting for those threads to 
exit (and perhaps they need memory reserves in the exit path) or, in the 
worst case, we panic if nothing else is eligible.

The end result is that, without this patch, we sometimes panic 
unnecessarily (and "sometimes" means "on many machines" for us): nothing 
else is eligible for kill within an oom cpuset, yet a do_each_thread() 
over that cpuset shows threads of a previously oom killed parent that 
have yet to exit.
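
To make the difference concrete, here is a minimal userspace sketch, not 
kernel code. struct sim_task, SIM_PF_EXITING and both find_exiting_*() 
helpers are hypothetical stand-ins that only model a leader-only walk 
(in the spirit of for_each_process()) versus an every-thread walk (in the 
spirit of do_each_thread()). It assumes the group leader has already 
dropped its mm while a sibling thread is still exiting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; this is a stand-in, not the kernel's definition. */
#define SIM_PF_EXITING 0x4u

/* Hypothetical, heavily simplified model of a task. */
struct sim_task {
	int pid;
	int tgid;		/* thread group id; equals pid for the group leader */
	unsigned int flags;
	bool has_mm;		/* models p->mm != NULL */
};

/* The check from the patched hunk: exiting and still holding an mm. */
bool sim_exiting_with_mm(const struct sim_task *t)
{
	return (t->flags & SIM_PF_EXITING) && t->has_mm;
}

/* Leader-only walk: non-leader threads are never examined. */
const struct sim_task *find_exiting_by_process(const struct sim_task *tasks,
					       size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].pid != tasks[i].tgid)
			continue;	/* skip non-leader threads */
		if (sim_exiting_with_mm(&tasks[i]))
			return &tasks[i];
	}
	return NULL;
}

/* Every-thread walk: each thread is examined individually. */
const struct sim_task *find_exiting_by_thread(const struct sim_task *tasks,
					      size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (sim_exiting_with_mm(&tasks[i]))
			return &tasks[i];
	}
	return NULL;
}
```

Given a group whose leader (pid 100) has no mm left and whose thread 
(pid 101, tgid 100) is still SIM_PF_EXITING with an mm, the leader-only 
walk returns NULL while the every-thread walk returns the pid 101 entry. 
That is the case where the oom killer would otherwise move on to another 
victim or panic.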

> > > @@ -324,7 +324,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
> > >  		 * the process of exiting and releasing its resources.
> > >  		 * Otherwise we could get an easy OOM deadlock.
> > >  		 */
> > > -		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
> > > +		if ((p->flags & PF_EXITING) && p->mm) {
> 
> The previous check was not perfect, we know this.
> 
> But with this patch applied, the simple program below disables oom-killer
> completely. select_bad_process() can never succeed.
> 

The program illustrates a problem that shouldn't be fixed in 
select_bad_process() but rather in oom_kill_process() when choosing an 
eligible child of the selected task to kill in place of its parent.
