- fix-interleave-with-memoryless-nodes.patch removed from -mm tree

The patch titled
     Fix INTERLEAVE with memoryless nodes
has been removed from the -mm tree.  Its filename was
     fix-interleave-with-memoryless-nodes.patch

This patch was dropped because it hasn't been tested yet.

------------------------------------------------------
Subject: Fix INTERLEAVE with memoryless nodes
From: Nishanth Aravamudan <nacc@xxxxxxxxxx>

Based on ideas from Christoph Lameter, add checks in the INTERLEAVE paths
for memoryless nodes.  We do not want to try interleaving onto those nodes.
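
To illustrate the setup-time check, here is a minimal userspace sketch. It is
not kernel code: nodemasks are modelled as a plain unsigned long, and
interleave_filter() and nodemask_model_t are hypothetical names invented for
this example. It only shows the idea of masking the requested interleave set
with the populated set and rejecting an empty result.

```c
#include <errno.h>

/* Hypothetical stand-in for a nodemask: bit n set means node n. */
typedef unsigned long nodemask_model_t;

/*
 * Model of the MPOL_INTERLEAVE setup step: keep only the requested nodes
 * that have memory. Returns 0 and stores the filtered set in *out, or
 * -EINVAL if no requested node has memory.
 */
static int interleave_filter(nodemask_model_t requested,
			     nodemask_model_t populated,
			     nodemask_model_t *out)
{
	nodemask_model_t usable = requested & populated;

	if (usable == 0)
		return -EINVAL;

	*out = usable;
	return 0;
}
```

With nodes 0-3 requested (0xF) and only nodes 0 and 2 populated (0x5), the
filtered set is 0x5. If none of the requested nodes has memory, the policy is
rejected.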

Christoph said:
"This does not work for the address based interleaving for anonymous
vmas.  I am not sure what to do there. We could change the calculation
of the node to be based only on nodes with memory and then skip the
memoryless ones. I have only added a comment to describe its brokenness
for now."
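
One way to read that suggestion, sketched in userspace C under stated
assumptions: the node for a given offset is picked by counting only nodes
that are both in the policy and populated. This is not what the patch
implements (the patch only adds a comment), and offset_interleave_node() and
MODEL_MAX_NODES are hypothetical names for this example.

```c
#include <limits.h>

/* Hypothetical limit: one bit per node in an unsigned long. */
#define MODEL_MAX_NODES ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Pick the interleave node for offset 'off', counting only nodes that are
 * set in both 'policy_nodes' and 'populated'. Returns the node id, or -1
 * if no policy node has memory.
 */
static int offset_interleave_node(unsigned long policy_nodes,
				  unsigned long populated,
				  unsigned long off)
{
	unsigned long usable = policy_nodes & populated;
	unsigned int nnodes = 0, target, seen = 0;
	int nid;

	for (nid = 0; nid < MODEL_MAX_NODES; nid++)
		if (usable & (1UL << nid))
			nnodes++;
	if (nnodes == 0)
		return -1;

	/* Index into the usable nodes only, so memoryless ones are skipped. */
	target = off % nnodes;
	for (nid = 0; nid < MODEL_MAX_NODES; nid++) {
		if (!(usable & (1UL << nid)))
			continue;
		if (seen++ == target)
			return nid;
	}
	return -1;
}
```

In this model, offsets alternate evenly between the populated nodes instead
of falling back from a memoryless node onto one neighbor.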

I have copied his draft's comment.

Change alloc_pages_node() to fail __GFP_THISNODE allocations if the node
is not populated.
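
The intended behaviour can be modelled in userspace roughly as follows. This
is a sketch, not the kernel function: MODEL_GFP_THISNODE,
model_alloc_pages_node() and the "lowest populated node" fallback are
assumptions made for the example, with the fallback standing in for a walk
of the real zonelist.

```c
#include <limits.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
#define MODEL_GFP_THISNODE 0x1u
#define MODEL_MAX_NODES ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Returns the node the allocation would be served from, or -1 on failure.
 * 'populated' has bit n set if node n has memory. A negative 'nid' means
 * "use current_node". Assumes nid and current_node < MODEL_MAX_NODES.
 */
static int model_alloc_pages_node(int nid, unsigned int flags,
				  unsigned long populated, int current_node)
{
	int n;

	/* Unknown node is current node */
	if (nid < 0)
		nid = current_node;

	/* A THISNODE request must not spill onto another node. */
	if ((flags & MODEL_GFP_THISNODE) && !(populated & (1UL << nid)))
		return -1;

	if (populated & (1UL << nid))
		return nid;

	/* Stand-in for zonelist fallback: lowest-numbered populated node. */
	for (n = 0; n < MODEL_MAX_NODES; n++)
		if (populated & (1UL << n))
			return n;
	return -1;
}
```

With nodes 0 and 2 populated, a THISNODE request for node 1 fails, while the
same request without the flag falls back to another node.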

Again, Christoph said:
"This will fix the alloc_pages_node case but not the alloc_pages() case.
In the alloc_pages() case we do not specify a node. Implicitly it is
understood that we (in the case of no memory policy / cpuset options)
allocate from the nearest node. So it may be argued there that the
GFP_THISNODE behavior of taking the first node from the zonelist is
okay."

Christoph was also worried about the performance impact on these paths,
as am I.

Finally, as he suggested, uninline alloc_pages_node() and move it to
mempolicy.c.

Signed-off-by: Nishanth Aravamudan <nacc@xxxxxxxxxx>
Acked-by: Christoph Lameter <clameter@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 include/linux/gfp.h |   14 +-------------
 mm/mempolicy.c      |   37 ++++++++++++++++++++++++++++++++-----
 2 files changed, 33 insertions(+), 18 deletions(-)

diff -puN include/linux/gfp.h~fix-interleave-with-memoryless-nodes include/linux/gfp.h
--- a/include/linux/gfp.h~fix-interleave-with-memoryless-nodes
+++ a/include/linux/gfp.h
@@ -130,19 +130,7 @@ static inline void arch_alloc_page(struc
 extern struct page *
 FASTCALL(__alloc_pages(gfp_t, unsigned int, struct zonelist *));
 
-static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
-						unsigned int order)
-{
-	if (unlikely(order >= MAX_ORDER))
-		return NULL;
-
-	/* Unknown node is current node */
-	if (nid < 0)
-		nid = numa_node_id();
-
-	return __alloc_pages(gfp_mask, order,
-		NODE_DATA(nid)->node_zonelists + gfp_zone(gfp_mask));
-}
+extern struct page *alloc_pages_node(int, gfp_t, unsigned int);
 
 #ifdef CONFIG_NUMA
 extern struct page *alloc_pages_current(gfp_t gfp_mask, unsigned order);
diff -puN mm/mempolicy.c~fix-interleave-with-memoryless-nodes mm/mempolicy.c
--- a/mm/mempolicy.c~fix-interleave-with-memoryless-nodes
+++ a/mm/mempolicy.c
@@ -184,8 +184,12 @@ static struct mempolicy *mpol_new(int mo
 	atomic_set(&policy->refcnt, 1);
 	switch (mode) {
 	case MPOL_INTERLEAVE:
-		policy->v.nodes = *nodes;
-		if (nodes_weight(*nodes) == 0) {
+		/*
+		 * Clear any memoryless nodes here so that v.nodes can be used
+		 * without extra checks
+		 */
+		nodes_and(policy->v.nodes, *nodes, node_populated_map);
+		if (nodes_weight(policy->v.nodes) == 0) {
 			kmem_cache_free(policy_cache, policy);
 			return ERR_PTR(-EINVAL);
 		}
@@ -578,6 +582,22 @@ long do_get_mempolicy(int *policy, nodem
 	return err;
 }
 
+struct page *alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
+{
+	if (unlikely(order >= MAX_ORDER))
+		return NULL;
+
+	/* Unknown node is current node */
+	if (nid < 0)
+		nid = numa_node_id();
+
+	if ((gfp_mask & __GFP_THISNODE) && !node_populated(nid))
+		return NULL;
+
+	return __alloc_pages(gfp_mask, order,
+		NODE_DATA(nid)->node_zonelists + gfp_zone(gfp_mask));
+}
+
 #ifdef CONFIG_MIGRATION
 /*
  * page migration
@@ -1125,9 +1145,11 @@ static unsigned interleave_nodes(struct 
 	struct task_struct *me = current;
 
 	nid = me->il_next;
-	next = next_node(nid, policy->v.nodes);
-	if (next >= MAX_NUMNODES)
-		next = first_node(policy->v.nodes);
+	do {
+		next = next_node(nid, policy->v.nodes);
+		if (next >= MAX_NUMNODES)
+			next = first_node(policy->v.nodes);
+	} while (!node_populated(next));
 	me->il_next = next;
 	return nid;
 }
@@ -1191,6 +1213,11 @@ static inline unsigned interleave_nid(st
 		 * for huge pages, since vm_pgoff is in units of small
 		 * pages, we need to shift off the always 0 bits to get
 		 * a useful offset.
+		 *
+		 * NOTE: For configurations with memoryless nodes this
+		 * is broken since the allocation attempts on that node
+		 * will fall back to other nodes and thus one
+		 * neighboring node will be overallocated from.
 		 */
 		BUG_ON(shift < PAGE_SHIFT);
 		off = vma->vm_pgoff >> (shift - PAGE_SHIFT);
_

Patches currently in -mm which might be from nacc@xxxxxxxxxx are

hugetlb-remove-unnecessary-nid-initialization.patch
gfph-gfp_thisnode-can-go-to-other-nodes-if-some-are-unpopulated.patch
add-populated_map-to-account-for-memoryless-nodes.patch
add-populated_map-to-account-for-memoryless-nodes-fix.patch
fix-interleave-with-memoryless-nodes.patch
