Re: [patch 09/18] oom: select task from tasklist for mempolicy ooms

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 6 Jun 2010 15:34:31 -0700 (PDT)
David Rientjes <rientjes@xxxxxxxxxx> wrote:

> The oom killer presently kills current whenever there is no more memory
> free or reclaimable on its mempolicy's nodes.  There is no guarantee that
> current is a memory-hogging task or that killing it will free any
> substantial amount of memory, however.
> 
> In such situations, it is better to scan the tasklist for nodes that are
> allowed to allocate on current's set of nodes and kill the task with the
> highest badness() score.  This ensures that the most memory-hogging task,
> or the one configured by the user with /proc/pid/oom_adj, is always
> selected in such scenarios.
> 
>
> ...
>
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -27,6 +27,7 @@
>  #include <linux/module.h>
>  #include <linux/notifier.h>
>  #include <linux/memcontrol.h>
> +#include <linux/mempolicy.h>
>  #include <linux/security.h>
>  
>  int sysctl_panic_on_oom;
> @@ -36,20 +37,36 @@ static DEFINE_SPINLOCK(zone_scan_lock);
>  /* #define DEBUG */
>  
>  /*
> - * Is all threads of the target process nodes overlap ours?
> + * Do all threads of the target process overlap our allowed nodes?
> + * @tsk: task struct of which task to consider
> + * @mask: nodemask passed to page allocator for mempolicy ooms

The comment uses kerneldoc annotation but isn't a kerneldoc comment.

>   */
> -static int has_intersects_mems_allowed(struct task_struct *tsk)
> +static bool has_intersects_mems_allowed(struct task_struct *tsk,
> +					const nodemask_t *mask)
>  {
> -	struct task_struct *t;
> +	struct task_struct *start = tsk;
>  
> -	t = tsk;
>  	do {
> -		if (cpuset_mems_allowed_intersects(current, t))
> -			return 1;
> -		t = next_thread(t);
> -	} while (t != tsk);
> -
> -	return 0;
> +		if (mask) {
> +			/*
> +			 * If this is a mempolicy constrained oom, tsk's
> +			 * cpuset is irrelevant.  Only return true if its
> +			 * mempolicy intersects current, otherwise it may be
> +			 * needlessly killed.
> +			 */
> +			if (mempolicy_nodemask_intersects(tsk, mask))
> +				return true;

The comment refers to `current' but the code does not?

> +		} else {
> +			/*
> +			 * This is not a mempolicy constrained oom, so only
> +			 * check the mems of tsk's cpuset.
> +			 */

The comment doesn't refer to `current', but the code does.  Confused.

> +			if (cpuset_mems_allowed_intersects(current, tsk))
> +				return true;
> +		}
> +		tsk = next_thread(tsk);

hm, next_thread() uses list_entry_rcu().  What are the locking rules
here?  It's one of both of rcu_read_lock() and read_lock(&tasklist_lock),
I think?

> +	} while (tsk != start);
> +	return false;
>  }

This is all bloat and overhead for non-NUMA builds.  I doubt if gcc is
able to eliminate the task_struct walk (although I didn't check).

The function isn't oom-killer-specific at all - give it a better name
then move it to mempolicy.c or similar?  If so, the text "oom"
shouldn't appear in the comments.

>
> ...
>
> @@ -676,24 +699,19 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
>  	 */
>  	constraint = constrained_alloc(zonelist, gfp_mask, nodemask);
>  	read_lock(&tasklist_lock);
> -
> -	switch (constraint) {
> -	case CONSTRAINT_MEMORY_POLICY:
> -		oom_kill_process(current, gfp_mask, order, 0, NULL,
> -				"No available memory (MPOL_BIND)");
> -		break;
> -
> -	case CONSTRAINT_NONE:
> -		if (sysctl_panic_on_oom) {
> +	if (unlikely(sysctl_panic_on_oom)) {
> +		/*
> +		 * panic_on_oom only affects CONSTRAINT_NONE, the kernel
> +		 * should not panic for cpuset or mempolicy induced memory
> +		 * failures.
> +		 */

This wasn't changelogged?

> +		if (constraint == CONSTRAINT_NONE) {
>  			dump_header(NULL, gfp_mask, order, NULL);
> -			panic("out of memory. panic_on_oom is selected\n");
> +			read_unlock(&tasklist_lock);
> +			panic("Out of memory: panic_on_oom is enabled\n");
>  		}
> -		/* Fall-through */
> -	case CONSTRAINT_CPUSET:
> -		__out_of_memory(gfp_mask, order);
> -		break;
>  	}
> -
> +	__out_of_memory(gfp_mask, order, constraint, nodemask);
>  	read_unlock(&tasklist_lock);
>  
>  	/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]