Re: [PATCH 3/3 for 2.6.38] oom: oom_kill_process: fix the child_points logic

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 14 Mar 2011, Oleg Nesterov wrote:

> oom_kill_process() starts with victim_points == 0. This means that
> (most likely) any child has more points and can be killed erroneously.
> 
> Also, "children has a different mm" doesn't match the reality, we
> should check child->mm != t->mm. This check is not exactly correct
> if t->mm == NULL but this doesn't really matter, oom_kill_task()
> will kill them anyway.
> 
> Note: "Kill all processes sharing p->mm" in oom_kill_task() is wrong
> too.
> 

There're two issues you're addressing in this patch.  It only kills a 
child in place of its selected parent when:

 - the child has a higher badness score, and

 - it has a different ->mm.

In the former case, NACK, we always want to sacrifice children regardless 
of their badness score (as long as it is non-zero) if it has a separate 
->mm in place of its parent, otherwise webservers will be killed instead 
of one of their children serving a client, sshd could be killed instead of 
bash, etc.  The behavior of the oom killer has always been to try to kill 
a child with its own ->mm first to avoid losing a large amount of work 
being done or unnecessarily killing a job scheduler, for example, when 
sacrificing a child would be satisfactory.  It'll kill additional tasks, 
and perhaps even the parent later if it has no more children, if the oom 
condition persists.

In the latter case, I agree, we should be testing if the child has a 
different ->mm before sacrificing it for its parent as the comment 
indicates it will.  I proposed that exact change in "oom: avoid deferring 
oom killer if exiting task is being traced" posted to -mm a couple days 
ago.

> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> ---
> 
>  mm/oom_kill.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> --- 38/mm/oom_kill.c~3_fix_kill_chld	2011-03-14 18:52:39.000000000 +0100
> +++ 38/mm/oom_kill.c	2011-03-14 19:36:01.000000000 +0100
> @@ -459,10 +459,10 @@ static int oom_kill_process(struct task_
>  			    struct mem_cgroup *mem, nodemask_t *nodemask,
>  			    const char *message)
>  {
> -	struct task_struct *victim = p;
> +	struct task_struct *victim;
>  	struct task_struct *child;
> -	struct task_struct *t = p;
> -	unsigned int victim_points = 0;
> +	struct task_struct *t;
> +	unsigned int victim_points;
>  
>  	if (printk_ratelimit())
>  		dump_header(p, gfp_mask, order, mem, nodemask);
> @@ -488,10 +488,15 @@ static int oom_kill_process(struct task_
>  	 * parent.  This attempts to lose the minimal amount of work done while
>  	 * still freeing memory.
>  	 */
> +	victim_points = oom_badness(p, mem, nodemask, totalpages);
> +	victim = p;
> +	t = p;
>  	do {
>  		list_for_each_entry(child, &t->children, sibling) {
>  			unsigned int child_points;
>  
> +			if (child->mm == t->mm)
> +				continue;
>  			/*
>  			 * oom_badness() returns 0 if the thread is unkillable
>  			 */
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]