Re: [PATCH] mm/buddy: fix default NUMA nodes

On Sun, 10 Jun 2012, Gavin Shan wrote:

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7892f84..dda83c5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2474,6 +2474,7 @@ struct page *
>  __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>  			struct zonelist *zonelist, nodemask_t *nodemask)
>  {
> +	nodemask_t *preferred_nodemask = nodemask ? : &cpuset_current_mems_allowed;
>  	enum zone_type high_zoneidx = gfp_zone(gfp_mask);
>  	struct zone *preferred_zone;
>  	struct page *page = NULL;
> @@ -2501,19 +2502,18 @@ retry_cpuset:
>  	cpuset_mems_cookie = get_mems_allowed();
>  
>  	/* The preferred zone is used for statistics later */
> -	first_zones_zonelist(zonelist, high_zoneidx,
> -				nodemask ? : &cpuset_current_mems_allowed,
> +	first_zones_zonelist(zonelist, high_zoneidx, preferred_nodemask,
>  				&preferred_zone);
>  	if (!preferred_zone)
>  		goto out;
>  
>  	/* First allocation attempt */
> -	page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
> -			zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET,
> +	page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, preferred_nodemask,
> +			order, zonelist, high_zoneidx, ALLOC_WMARK_LOW|ALLOC_CPUSET,
>  			preferred_zone, migratetype);
>  	if (unlikely(!page))
> -		page = __alloc_pages_slowpath(gfp_mask, order,
> -				zonelist, high_zoneidx, nodemask,
> +		page = __alloc_pages_slowpath(gfp_mask, order, zonelist,
> +				high_zoneidx, preferred_nodemask,
>  				preferred_zone, migratetype);
>  
>  	trace_mm_page_alloc(page, order, gfp_mask, migratetype);

Nack, this is wrong.  The nodemask passed to first_zones_zonelist() is 
only used to pick the preferred zone for statistics and is correct as 
written.  The nodemask passed to get_page_from_freelist() constrains the 
zonelist iteration to only those nodes.  With your patch, if a NULL 
nodemask is passed into the page allocator (meaning the task has a 
default mempolicy), that iteration would be done over 
cpuset_current_mems_allowed only.  Allocations on non-cpuset nodes are 
allowed in some contexts, see cpuset_zone_allowed_softwall(), so this 
would cause a regression.
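
To illustrate the difference, here is a minimal userspace sketch, not 
kernel code.  Every name in it (toy_nodemask_t, toy_cpuset_allowed(), 
toy_alloc_node(), the "relaxed" flag) is hypothetical, and the "relaxed" 
flag is only a stand-in for whatever contexts permit an allocation 
outside the cpuset.  It assumes a NULL nodemask means "consider every 
node" and that the cpuset check is applied per node during the walk.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_NODES 4

/* Toy nodemask (hypothetical): bit n set means node n is allowed. */
typedef unsigned int toy_nodemask_t;

/*
 * Hypothetical stand-in for the per-zone cpuset check.  "relaxed" models
 * a context in which allocating outside the cpuset is permitted.
 */
static bool toy_cpuset_allowed(int node, toy_nodemask_t mems_allowed,
			       bool relaxed)
{
	return relaxed || (mems_allowed & (1u << node));
}

/*
 * Walk the nodes in order and return the first one that passes the
 * nodemask filter and the cpuset check and has free pages, or -1.
 * A NULL nodemask filters nothing out.
 */
static int toy_alloc_node(const int *free_pages,
			  const toy_nodemask_t *nodemask,
			  toy_nodemask_t mems_allowed, bool relaxed)
{
	for (int node = 0; node < NR_NODES; node++) {
		/* Hard filter: a masked-out node is never considered. */
		if (nodemask && !(*nodemask & (1u << node)))
			continue;
		/* Soft filter: may be bypassed in a relaxed context. */
		if (!toy_cpuset_allowed(node, mems_allowed, relaxed))
			continue;
		if (free_pages[node] > 0)
			return node;
	}
	return -1;
}
```

With free pages only on node 2 and a cpuset covering nodes 0 and 1, a 
NULL nodemask in a relaxed context can still reach node 2.  Passing the 
cpuset mask as the nodemask, as the patch effectively does, filters node 
2 out before the cpuset check is ever consulted, so the same request 
fails.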
