On Wed, 19 Jan 2011, Minchan Kim wrote:

> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2034,6 +2034,18 @@ restart:
> >          */
> >         alloc_flags = gfp_to_alloc_flags(gfp_mask);
> >
> > +       /*
> > +        * If preferred_zone cannot be allocated from in this context, find the
> > +        * first allowable zone instead.
> > +        */
> > +       if ((alloc_flags & ALLOC_CPUSET) &&
> > +           !cpuset_zone_allowed_softwall(preferred_zone, gfp_mask)) {
> > +               first_zones_zonelist(zonelist, high_zoneidx,
> > +                               &cpuset_current_mems_allowed, &preferred_zone);
>
> This patch is one we need. but I have a nitpick.
> I am not familiar with CPUSET so I might be wrong.
>
> I think it could make side effect of statistics of ZVM on
> buffered_rmqueue since you intercept and change preferred_zone.
> It could make NUMA_HIT instead of NUMA_MISS.
> Is it your intention?
>

It depends on the semantics of NUMA_MISS: if no local nodes are allowed by
current's cpuset (a pretty poor cpuset config :), then it seems logical that
all allocations would be a miss.
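
To make the accounting question concrete, here is a rough userspace sketch,
not kernel code. The names struct model_zone, model_zone_statistics() and
model_first_allowed() are made up for illustration, and the hit/miss rule
(compare the node of the zone that satisfied the allocation against the node
of preferred_zone) is an assumption about how the counters are kept:

```c
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
enum numa_stat { NUMA_HIT, NUMA_MISS, NUMA_FOREIGN, NR_NUMA_STATS };

struct model_zone {
	int node;                          /* NUMA node this zone belongs to */
	unsigned long stat[NR_NUMA_STATS]; /* per-zone counters */
};

/*
 * Account one allocation satisfied from zone z when the allocator
 * preferred zone "preferred". Same node counts as a hit; otherwise z
 * records a miss and the preferred zone records a foreign allocation.
 */
static void model_zone_statistics(struct model_zone *preferred,
				  struct model_zone *z)
{
	if (z->node == preferred->node) {
		z->stat[NUMA_HIT]++;
	} else {
		z->stat[NUMA_MISS]++;
		preferred->stat[NUMA_FOREIGN]++;
	}
}

/*
 * Return the first zone whose node is set in the allowed-node bitmask,
 * or NULL if no zone qualifies. This stands in for re-selecting
 * preferred_zone from the nodes the cpuset allows.
 */
static struct model_zone *model_first_allowed(struct model_zone *zones,
					      int nr, unsigned long allowed)
{
	for (int i = 0; i < nr; i++)
		if (allowed & (1UL << zones[i].node))
			return &zones[i];
	return NULL;
}
```

With one zone on node 0 (local) and one on node 1, and a cpuset that allows
only node 1: if preferred stays on node 0, an allocation from node 1 is
counted as NUMA_MISS on node 1 and NUMA_FOREIGN on node 0. If preferred is
first re-selected with model_first_allowed(), the same allocation is counted
as NUMA_HIT on node 1. Under that assumption, this is the change in
statistics being asked about.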