Re: [patch] oom: give current access to memory reserves if it has been killed

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 1 Apr 2010, Oleg Nesterov wrote:

> > > > > Say, oom_forkbomb_penalty() does list_for_each_entry(tsk->children).
> > > > > Again, this is not right even if we forget about !child->mm check.
> > > > > This list_for_each_entry() can only see the processes forked by the
> > > > > main thread.
> > > > >
> > > >
> > > > That's the intention.
> > >
> > > Why? shouldn't oom_badness() return the same result for any thread
> > > in thread group? We should take all childs into account.
> > >
> >
> > oom_forkbomb_penalty() only cares about first-descendant children that
> > do not share the same memory,
> 
> I see, but the code doesn't really do this. I mean, it doesn't really
> see the first-descendant children, only those which were forked by the
> main thread.
> 
> Look. We have a main thread M and the sub-thread T. T forks a lot of
> processes which use a lot of memory. These processes _are_ the first
> descendant children of the M+T thread group, they should be accounted.
> But M->children list is empty.
> 
> oom_forkbomb_penalty() and oom_kill_process() should do
> 
> 	t = tsk;
> 	do {
> 		list_for_each_entry(child, &t->children, sibling) {
> 			... take child into account ...
> 		}
> 	} while_each_thread(tsk, t);
> 
> 

In this case, it seems more appropriate that we would penalize T and not M 
since it's not necessarily responsible for the behavior of the children it 
forks.  T is the buggy/malicious program, not M.

> See the patch below. Yes, this is minor, but it is always good to avoid
> the unnecessary locks, and thread_group_cputime() is O(N).
> 
> Not only for performance reasons. This allows to change the locking in
> thread_group_cputime() if needed without fear to deadlock with task_lock().
> 
> Oleg.
> 
> --- x/mm/oom_kill.c
> +++ x/mm/oom_kill.c
> @@ -97,13 +97,16 @@ static unsigned long oom_forkbomb_penalt
>  		return 0;
>  	list_for_each_entry(child, &tsk->children, sibling) {
>  		struct task_cputime task_time;
> -		unsigned long runtime;
> +		unsigned long runtime, this_rss;
>  
>  		task_lock(child);
>  		if (!child->mm || child->mm == tsk->mm) {
>  			task_unlock(child);
>  			continue;
>  		}
> +		this_rss = get_mm_rss(child->mm);
> +		task_unlock(child);
> +
>  		thread_group_cputime(child, &task_time);
>  		runtime = cputime_to_jiffies(task_time.utime) +
>  			  cputime_to_jiffies(task_time.stime);
> @@ -113,10 +116,9 @@ static unsigned long oom_forkbomb_penalt
>  		 * get to execute at all in such cases anyway.
>  		 */
>  		if (runtime < HZ) {
> -			child_rss += get_mm_rss(child->mm);
> +			child_rss += this_rss;
>  			forkcount++;
>  		}
> -		task_unlock(child);
>  	}
>  
>  	/*

This patch looks good, will you send it to Andrew with a changelog and 
sign-off line?  Also feel free to add:

Acked-by: David Rientjes <rientjes@xxxxxxxxxx>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]