Re: [patch 3/7 -mm] oom: select task from tasklist for mempolicy ooms

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 15 Feb 2010, KOSAKI Motohiro wrote:

> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1638,6 +1638,45 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
> >  }
> >  #endif
> >  
> > +/*
> > + * mempolicy_nodemask_intersects
> > + *
> > + * If tsk's mempolicy is "default" [NULL], return 'true' to indicate default
> > + * policy.  Otherwise, check for intersection between mask and the policy
> > + * nodemask for 'bind' or 'interleave' policy, or mask to contain the single
> > + * node for 'preferred' or 'local' policy.
> > + */
> > +bool mempolicy_nodemask_intersects(struct task_struct *tsk,
> > +					const nodemask_t *mask)
> > +{
> > +	struct mempolicy *mempolicy;
> > +	bool ret = true;
> > +
> > +	mempolicy = tsk->mempolicy;
> > +	mpol_get(mempolicy);
> 
> Why is this refcount increment necessary? mempolicy is grabbed by tsk,
> IOW it never be freed in this function.
> 

We need to get a refcount on the mempolicy to ensure it doesn't get freed 
from under us, tsk is not necessarily current.

> 
> > +	if (!mask || !mempolicy)
> > +		goto out;
> > +
> > +	switch (mempolicy->mode) {
> > +	case MPOL_PREFERRED:
> > +		if (mempolicy->flags & MPOL_F_LOCAL)
> > +			ret = node_isset(numa_node_id(), *mask);
> 
> Um? Is this good heuristic?
> The task can migrate various cpus, then "node_isset(numa_node_id(), *mask) == 0"
> doesn't mean the task doesn't consume *mask's memory.
> 

For MPOL_F_LOCAL, we need to check whether the task's cpu is on a node 
that is allowed by the zonelist passed to the page allocator.  In the 
second revision of this patchset, this was changed to

	node_isset(cpu_to_node(task_cpu(tsk)), *mask)

to check.  It would be possible for no memory to have been allocated on 
that node and it just happens that the tsk is running on it momentarily, 
but it's the best indication we have given the mempolicy of whether 
killing a task may lead to future memory freeing.

> > @@ -660,24 +683,18 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
> >  	 */
> >  	constraint = constrained_alloc(zonelist, gfp_mask, nodemask);
> >  	read_lock(&tasklist_lock);
> > -
> > -	switch (constraint) {
> > -	case CONSTRAINT_MEMORY_POLICY:
> > -		oom_kill_process(current, gfp_mask, order, 0, NULL,
> > -				"No available memory (MPOL_BIND)");
> > -		break;
> > -
> > -	case CONSTRAINT_NONE:
> > -		if (sysctl_panic_on_oom) {
> > +	if (unlikely(sysctl_panic_on_oom)) {
> > +		/*
> > +		 * panic_on_oom only affects CONSTRAINT_NONE, the kernel
> > +		 * should not panic for cpuset or mempolicy induced memory
> > +		 * failures.
> > +		 */
> > +		if (constraint == CONSTRAINT_NONE) {
> >  			dump_header(NULL, gfp_mask, order, NULL);
> > -			panic("out of memory. panic_on_oom is selected\n");
> > +			panic("Out of memory: panic_on_oom is enabled\n");
> 
> enabled? Its feature is enabled at boot time. triggered? or fired?
> 

The panic_on_oom sysctl is "enabled" if it is set to non-zero; that's the 
word used throughout Documentation/sysctl/vm.txt to describe when a sysctl 
is being used or not.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]