On 24.01.2014 [15:49:33 -0800], David Rientjes wrote: > On Fri, 24 Jan 2014, Nishanth Aravamudan wrote: > > > > I think the problem is a memoryless node being used for kmalloc_node() so > > > we need to decide where to enforce node_present_pages(). __slab_alloc() > > > seems like the best candidate when !node_match(). > > > > Actually, this is effectively what Anton's patch does, except with > > Wanpeng's adjustment to use node_present_pages(). Does that seem > > sufficient to you? > > > > I don't see that as being the effect of Anton's patch. We need to use > numa_mem_id() as Christoph mentioned when a memoryless node is passed for > the best NUMA locality. Something like this: Thank you for clarifying and providing a test patch. I ran with this on the system showing the original problem, configured to have 15GB of memory. With your patch after boot: MemTotal: 15604736 kB MemFree: 8768192 kB Slab: 3882560 kB SReclaimable: 105408 kB SUnreclaim: 3777152 kB With Anton's patch after boot: MemTotal: 15604736 kB MemFree: 11195008 kB Slab: 1427968 kB SReclaimable: 109184 kB SUnreclaim: 1318784 kB I know that's fairly unscientific, but the numbers are reproducible. For what it's worth, a sample of the unmodified numbers: MemTotal: 15317632 kB MemFree: 5023424 kB Slab: 7176064 kB SReclaimable: 106816 kB SUnreclaim: 7069248 kB So it's an improvement, but something is still causing us to (it seems) be pretty inefficient with the slabs. > diff --git a/mm/slub.c b/mm/slub.c > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -2278,10 +2278,14 @@ redo: > > if (unlikely(!node_match(page, node))) { > stat(s, ALLOC_NODE_MISMATCH); > - deactivate_slab(s, page, c->freelist); > - c->page = NULL; > - c->freelist = NULL; > - goto new_slab; > + if (unlikely(!node_present_pages(node))) > + node = numa_mem_id(); > + if (!node_match(page, node)) { > + deactivate_slab(s, page, c->freelist); > + c->page = NULL; > + c->freelist = NULL; > + goto new_slab; > + } Semantically, and please correct me if I'm wrong, this patch is saying if we have a memoryless node, we expect the page's locality to be that of numa_mem_id(), and we still deactivate the slab if that isn't true. Just wanting to make sure I understand the intent. What I find odd is that there are only 2 nodes on this system, node 0 (empty) and node 1. So won't numa_mem_id() always be 1? And every page should be coming from node 1 (thus node_match() should always be true?) Thanks, Nish -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>