On Tue, Jan 07, 2014 at 04:48:40PM +0800, Wanpeng Li wrote:
> Hi Joonsoo,
> On Tue, Jan 07, 2014 at 04:41:36PM +0900, Joonsoo Kim wrote:
> >On Tue, Jan 07, 2014 at 01:21:00PM +1100, Anton Blanchard wrote:
> >>
> [...]
> >Hello,
> >
> >I think that we need more efforts to solve unbalanced node problem.
> >
> >With this patch, even if node of current cpu slab is not favorable to
> >unbalanced node, allocation would proceed and we would get the unintended memory.
> >
>
> We have a machine:
>
> [    0.000000] Node 0 Memory:
> [    0.000000] Node 4 Memory: 0x0-0x10000000 0x20000000-0x60000000 0x80000000-0xc0000000
> [    0.000000] Node 6 Memory: 0x10000000-0x20000000 0x60000000-0x80000000
> [    0.000000] Node 10 Memory: 0xc0000000-0x180000000
>
> [    0.041486] Node 0 CPUs: 0-19
> [    0.041490] Node 4 CPUs:
> [    0.041492] Node 6 CPUs:
> [    0.041495] Node 10 CPUs:
>
> The pages of current cpu slab should be allocated from fallback zones/nodes
> of the memoryless node in buddy system, how can not favorable happen?

Hi, Wanpeng.

IIRC, if we call kmem_cache_alloc_node() with certain node #, we try to
allocate the page in fallback zones/node of that node #. So fallback list isn't
related to fallback one of memoryless node #. Am I wrong?

Thanks.

>
> >And there is one more problem. Even if we have some partial slabs on
> >compatible node, we would allocate new slab, because get_partial() cannot handle
> >this unbalance node case.
> >
> >To fix this correctly, how about following patch?
> >
>
> So I think we should fold both of your two patches to one.
>
> Regards,
> Wanpeng Li
>
> >Thanks.
> >
> >------------->8--------------------
> >diff --git a/mm/slub.c b/mm/slub.c
> >index c3eb3d3..a1f6dfa 100644
> >--- a/mm/slub.c
> >+++ b/mm/slub.c
> >@@ -1672,7 +1672,19 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
> > {
> > 	void *object;
> > 	int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
> >+	struct zonelist *zonelist;
> >+	struct zoneref *z;
> >+	struct zone *zone;
> >+	enum zone_type high_zoneidx = gfp_zone(flags);
> >
> >+	if (!node_present_pages(searchnode)) {
> >+		zonelist = node_zonelist(searchnode, flags);
> >+		for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
> >+			searchnode = zone_to_nid(zone);
> >+			if (node_present_pages(searchnode))
> >+				break;
> >+		}
> >+	}
> > 	object = get_partial_node(s, get_node(s, searchnode), c, flags);
> > 	if (object || node != NUMA_NO_NODE)
> > 		return object;
> >

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
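
For illustration only (this is not part of the patch), the search-node
selection that the diff adds to get_partial() can be modelled in user space.
The sketch assumes a fallback order of 4, 6, 10 for node 0; the thread does
not give the real zonelist order, and pick_search_node(), node_has_memory and
node0_fallback are made-up names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 11

/*
 * Hypothetical stand-in for node_present_pages(): true if the node has
 * memory. Modelled on the boot log quoted in the thread, where nodes 4, 6
 * and 10 have memory and node 0 has CPUs only.
 */
static const bool node_has_memory[MAX_NODES] = {
	[4] = true, [6] = true, [10] = true,
};

/*
 * ASSUMED fallback order for node 0. The kernel builds the real zonelist
 * itself; this order is invented for the example.
 */
static const int node0_fallback[] = { 4, 6, 10 };

/*
 * Mirror of the loop in the proposed patch: if the requested node has no
 * memory, walk its fallback list and stop at the first node that does.
 * As in the patch, if no listed node has memory, the last node visited
 * is returned.
 */
static int pick_search_node(int node, const bool *has_memory,
			    const int *fallback, size_t len)
{
	int searchnode = node;

	assert(node >= 0 && node < MAX_NODES);

	if (!has_memory[searchnode]) {
		for (size_t i = 0; i < len; i++) {
			searchnode = fallback[i];
			if (has_memory[searchnode])
				break;
		}
	}
	return searchnode;
}
```

Under these assumptions, pick_search_node(0, node_has_memory, node0_fallback, 3)
yields node 4, the first node with memory in the assumed order, while a
request for node 6 is returned unchanged because that node has memory.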