Here is a draft of a patch to make this work with memoryless nodes.

The first thing is that we modify node_match() to also match if we hit an
empty node. In that case we simply take the current slab if it's there. If
there is no current slab then a regular allocation occurs for the
memoryless node. The page allocator will fall back to an available node,
and the page from that node will become the current slab. The next alloc
from the memoryless node will then use that slab.

For that we also track, in the empty_node[] array, which nodes could not
satisfy an allocation. A successful alloc on a node clears that flag
again.

I would rather avoid the empty_node[] array since it is global and there
may be thread-specific allocation restrictions, but it would be expensive
to do an allocation attempt via the page allocator just to make sure that
there really is no page available on a node.

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c	2014-02-03 13:19:22.896853227 -0600
+++ linux/mm/slub.c	2014-02-07 12:44:49.311494806 -0600
@@ -132,6 +132,8 @@ static inline bool kmem_cache_has_cpu_pa
 #endif
 }
 
+static int empty_node[MAX_NUMNODES];
+
 /*
  * Issues still to be resolved:
  *
@@ -1405,16 +1407,22 @@ static struct page *new_slab(struct kmem
 	void *last;
 	void *p;
 	int order;
+	int alloc_node;
 
 	BUG_ON(flags & GFP_SLAB_BUG_MASK);
 
 	page = allocate_slab(s,
 		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
-	if (!page)
+	if (!page) {
+		if (node != NUMA_NO_NODE)
+			empty_node[node] = 1;
 		goto out;
+	}
 
 	order = compound_order(page);
-	inc_slabs_node(s, page_to_nid(page), page->objects);
+	alloc_node = page_to_nid(page);
+	empty_node[alloc_node] = 0;
+	inc_slabs_node(s, alloc_node, page->objects);
 	memcg_bind_pages(s, order);
 	page->slab_cache = s;
 	__SetPageSlab(page);
@@ -1712,7 +1720,7 @@ static void *get_partial(struct kmem_cac
 	struct kmem_cache_cpu *c)
 {
 	void *object;
-	int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
+	int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node;
 
 	object = get_partial_node(s, get_node(s, searchnode), c, flags);
 	if (object || node != NUMA_NO_NODE)
@@ -2107,8 +2115,26 @@ static void flush_all(struct kmem_cache
 static inline int node_match(struct page *page, int node)
 {
 #ifdef CONFIG_NUMA
-	if (!page || (node != NUMA_NO_NODE && page_to_nid(page) != node))
+	int page_node;
+
+	/* No data means no match */
+	if (!page)
 		return 0;
+
+	/* Node does not matter. Therefore anything is a match */
+	if (node == NUMA_NO_NODE)
+		return 1;
+
+	/* Did we hit the requested node? */
+	page_node = page_to_nid(page);
+	if (page_node == node)
+		return 1;
+
+	/* The requested node has memory available, so this is a real mismatch */
+	if (!empty_node[node])
+		return 0;
+
+	/* Target node empty so just take anything */
 #endif
 	return 1;
 }
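
For anyone who wants to exercise the decision table outside of slub.c,
below is a minimal userspace sketch of the new node_match() logic. It is
not kernel code: NO_SLAB, the hardcoded node ids and the reduction of a
"page" to the id of the node it came from are stand-ins made up for this
example; only the empty_node[] handling mirrors the patch above.

#include <stdio.h>

#define NUMA_NO_NODE	-1	/* caller does not care about placement */
#define NO_SLAB		-2	/* stand-in for "no current cpu slab" */
#define MAX_NUMNODES	4

/* Set when an allocation attempt on a node failed, cleared on success */
static int empty_node[MAX_NUMNODES];

/* page_node: node the current cpu slab came from; node: requested node */
static int node_match(int page_node, int node)
{
	/* No data means no match */
	if (page_node == NO_SLAB)
		return 0;

	/* Node does not matter, so anything is a match */
	if (node == NUMA_NO_NODE)
		return 1;

	/* Did we hit the requested node? */
	if (page_node == node)
		return 1;

	/* The requested node has memory, so this is a real mismatch */
	if (!empty_node[node])
		return 0;

	/* Requested node is known to be empty: just take anything */
	return 1;
}

int main(void)
{
	int cpu_slab = 0;	/* pretend the current cpu slab is from node 0 */

	/* Node 2 still has memory: reject the node-0 slab */
	printf("%d\n", node_match(cpu_slab, 2));

	/* An allocation attempt on node 2 failed: accept any slab */
	empty_node[2] = 1;
	printf("%d\n", node_match(cpu_slab, 2));

	/* A later successful allocation on node 2 clears the flag */
	empty_node[2] = 0;
	printf("%d\n", node_match(cpu_slab, 2));

	return 0;
}

Built with a plain cc, this prints 0, 1, 0: the node-0 slab is rejected
while node 2 has memory, accepted once node 2 is flagged empty, and
rejected again after a successful allocation clears the flag.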