Re: [RFC PATCH 2/3] topology: support node_numa_mem() for determining the fallback node

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Here is a draft of a patch to make this work with memoryless nodes.

The first thing is that we modify node_match to also match if we hit an
empty node. In that case we simply take the current slab if its there.

If there is no current slab then a regular allocation occurs with the
memoryless node. The page allocator will fallback to a possible node and
that will become the current slab. Next alloc from a memoryless node
will then use that slab.

For that we also add some tracking of allocations on nodes that were not
satisfied using the empty_node[] array. A successful alloc on a node
clears that flag.

I would rather avoid the empty_node[] array since its global and there may
be thread specific allocation restrictions but it would be expensive to do
an allocation attempt via the page allocator to make sure that there is
really no page available from the page allocator.

Index: linux/mm/slub.c
===================================================================
--- linux.orig/mm/slub.c	2014-02-03 13:19:22.896853227 -0600
+++ linux/mm/slub.c	2014-02-07 12:44:49.311494806 -0600
@@ -132,6 +132,8 @@ static inline bool kmem_cache_has_cpu_pa
 #endif
 }

+static int empty_node[MAX_NUMNODES];
+
 /*
  * Issues still to be resolved:
  *
@@ -1405,16 +1407,22 @@ static struct page *new_slab(struct kmem
 	void *last;
 	void *p;
 	int order;
+	int alloc_node;

 	BUG_ON(flags & GFP_SLAB_BUG_MASK);

 	page = allocate_slab(s,
 		flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
-	if (!page)
+	if (!page) {
+		if (node != NUMA_NO_NODE)
+			empty_node[node] = 1;
 		goto out;
+	}

 	order = compound_order(page);
-	inc_slabs_node(s, page_to_nid(page), page->objects);
+	alloc_node = page_to_nid(page);
+	empty_node[alloc_node] = 0;
+	inc_slabs_node(s, alloc_node, page->objects);
 	memcg_bind_pages(s, order);
 	page->slab_cache = s;
 	__SetPageSlab(page);
@@ -1712,7 +1720,7 @@ static void *get_partial(struct kmem_cac
 		struct kmem_cache_cpu *c)
 {
 	void *object;
-	int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
+	int searchnode = (node == NUMA_NO_NODE) ? numa_mem_id() : node;

 	object = get_partial_node(s, get_node(s, searchnode), c, flags);
 	if (object || node != NUMA_NO_NODE)
@@ -2107,8 +2115,25 @@ static void flush_all(struct kmem_cache
 static inline int node_match(struct page *page, int node)
 {
 #ifdef CONFIG_NUMA
-	if (!page || (node != NUMA_NO_NODE && page_to_nid(page) != node))
+	int page_node;
+
+	/* No data means no match */
+	if (!page)
 		return 0;
+
+	/* Node does not matter. Therefore anything is a match */
+	if (node == NUMA_NO_NODE)
+		return 1;
+
+	/* Did we hit the requested node ? */
+	page_node = page_to_nid(page);
+	if (page_node == node)
+		return 1;
+
+	/* If the node has available data then we can use it. Mismatch */
+	return !empty_node[page_node];
+
+	/* Target node empty so just take anything */
 #endif
 	return 1;
 }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]