[PATCH/RFC 3/8] numa - Migrate-on-Fault - check for misplaced page

Migrate-on-fault - check for misplaced page

This patch provides a new function to test whether a page resides
on a node that is appropriate for the mempolicy for the vma and
address where the page is supposed to be mapped.  This involves
looking up the node where the page belongs, so the function
returns that node [via an optional pointer argument] so that it may
be used to allocate the page without consulting the policy again.
Because interleaved and non-interleaved allocations are accounted
differently, the function's return value also indicates whether or
not the new node came from an interleaved policy, if the page is
misplaced.
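
For illustration only, here is a user-space sketch of the return-value
contract described above.  It is not kernel code:  model_misplaced() is a
hypothetical stand-in that takes the policy's node directly instead of
looking it up, and only the two MPOL_MIGRATE_* values are taken from this
patch.

```c
#include <assert.h>
#include <stddef.h>

/* These two values mirror the definitions added to mempolicy.h below. */
#define MPOL_MIGRATE_NONINTERLEAVED 1
#define MPOL_MIGRATE_INTERLEAVED 2

/*
 * Hypothetical model of the contract:  return 0 if the page is already
 * on the policy's node, else report the kind of policy that chose the
 * destination and pass the destination back via newnid, if !NULL.
 */
static int model_misplaced(int curnid, int polnid, int interleaved,
			   int *newnid)
{
	int ret = 0;

	assert(curnid >= 0 && polnid >= 0);

	if (curnid != polnid)
		ret = interleaved ? MPOL_MIGRATE_INTERLEAVED :
				    MPOL_MIGRATE_NONINTERLEAVED;

	/* newnid is written only when the page is misplaced */
	if (ret && newnid)
		*newnid = polnid;
	return ret;
}
```

A caller would test the return value for non-zero before looking at
newnid, since newnid is left untouched for a correctly placed page.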

A subsequent patch will call this function from the fault path for
stable pages with zero page_mapcount().  Because of this, I don't
want to go ahead and allocate a new page, e.g., via alloc_page_vma(),
only to have to free it if the existing page already satisfies the
policy.  So, I just mimic the alloc_page_vma() node computation
logic--sort of.
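
As a rough user-space model of the interleave part of that node
computation:  the page offset is derived from the vma and the faulting
address, and the node is the (offset mod number-of-nodes)'th node set in
the policy nodemask.  The names model_pgoff() and model_il_node(), the
64-bit mask and the 4K page size are assumptions made for this sketch;
the selection rule reflects my understanding of the kernel's interleave
helper, not a copy of it.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12	/* assumed 4K pages */
#define MODEL_MAX_NODES  64	/* assumed:  nodemask fits in 64 bits */

/* Page offset of addr within the mapped object, as in the patch. */
static unsigned long model_pgoff(unsigned long vm_start,
				 unsigned long vm_pgoff,
				 unsigned long addr)
{
	assert(addr >= vm_start);
	return vm_pgoff + ((addr - vm_start) >> MODEL_PAGE_SHIFT);
}

/*
 * Return the (pgoff % nnodes)'th set bit in nodemask, counting from
 * the lowest node id, or -1 if the mask is empty.
 */
static int model_il_node(unsigned long long nodemask, unsigned long pgoff)
{
	unsigned long target;
	int nnodes = 0;
	int nid;

	for (nid = 0; nid < MODEL_MAX_NODES; nid++)
		if (nodemask & (1ULL << nid))
			nnodes++;
	if (!nnodes)
		return -1;

	target = pgoff % nnodes;
	for (nid = 0; nid < MODEL_MAX_NODES; nid++) {
		if (!(nodemask & (1ULL << nid)))
			continue;
		if (target-- == 0)
			return nid;
	}
	return -1;	/* not reached for a non-empty mask */
}
```

With nodes 0, 2 and 3 in the mask, successive page offsets map to nodes
0, 2, 3, 0, ... so a page is misplaced whenever its current node differs
from the node selected for its offset.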

Note:  we could use this function to implement an MPOL_MF_STRICT
behavior when migrating pages to match mbind() mempolicy--e.g.,
to ensure that pages in an interleaved range are reinterleaved
rather than left where they are when they reside on any node in
the interleave nodemask.
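
To make the difference concrete, here is a small user-space sketch of the
two tests.  Both helper names are hypothetical and the nodemask is
modelled as a 64-bit word.  The "loose" test accepts a page on any node
in the interleave nodemask; the "strict" test accepts it only on the node
that interleaving would pick for its offset.

```c
#include <assert.h>
#include <stdbool.h>

/* Loose test:  page's node is anywhere in the interleave nodemask. */
static bool model_node_in_mask(unsigned long long nodemask, int nid)
{
	return nid >= 0 && nid < 64 && ((nodemask >> nid) & 1ULL) != 0;
}

/*
 * Strict test:  page's node is exactly the interleave target for its
 * offset.  il_target is assumed to have been computed by the caller.
 */
static bool model_strictly_placed(int curnid, int il_target)
{
	return curnid == il_target;
}
```

For a nodemask of nodes 0, 2 and 3, a page on node 3 whose interleave
target is node 2 passes the loose test but fails the strict one, so only
a strict check would reinterleave it.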

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@xxxxxx>

 include/linux/mempolicy.h |    9 ++++
 mm/mempolicy.c            |   84 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)

Index: linux-2.6.36-mmotm-101103-1217/mm/mempolicy.c
===================================================================
--- linux-2.6.36-mmotm-101103-1217.orig/mm/mempolicy.c
+++ linux-2.6.36-mmotm-101103-1217/mm/mempolicy.c
@@ -3009,3 +3009,87 @@ struct mpol_range *get_numa_submap(struc
 	spin_unlock(&sp->lock);
 	return ranges;
 }
+
+#ifdef CONFIG_MIGRATE_ON_FAULT
+/**
+ * mpol_misplaced - check whether current page node is valid in policy
+ *
+ * @page:   page to be checked
+ * @vma:    vm area where page mapped
+ * @addr:   virtual address where page mapped
+ * @newnid: [ptr to] node id to which page should be migrated
+ *
+ * Look up the current policy node id for vma,addr and compare it to the
+ * page's node id.
+ * If page valid in policy, return 0 -- !misplaced:  reuse current page
+ * Else
+ *     return destination nid via newnid, if !NULL
+ *     return MPOL_MIGRATE_NONINTERLEAVED for non-interleaved policy
+ *     return MPOL_MIGRATE_INTERLEAVED for interleaved policy.
+ * Policy determination "mimics" alloc_page_vma().
+ * Called from fault path where we know the vma and faulting address.
+ */
+int mpol_misplaced(struct page *page, struct vm_area_struct *vma,
+			 unsigned long addr, int *newnid)
+{
+	struct mempolicy *pol;
+	struct zone *zone;
+	int curnid = page_to_nid(page);
+	int polnid = -1;
+	int ret = 0;
+
+	BUG_ON(!vma);
+
+	pol = get_vma_policy(current, vma, addr);
+
+	if (unlikely(pol->mode == MPOL_INTERLEAVE)) {
+		unsigned long pgoff;
+
+		BUG_ON(addr >= vma->vm_end);
+		BUG_ON(addr < vma->vm_start);
+
+		pgoff = vma->vm_pgoff;
+		pgoff += (addr - vma->vm_start) >> PAGE_SHIFT;
+		polnid = offset_il_node(pol, pgoff);
+
+		if (curnid != polnid)
+			ret = MPOL_MIGRATE_INTERLEAVED;
+		goto out;
+	}
+
+	switch (pol->mode) {
+	case MPOL_PREFERRED:
+		if (pol->flags & MPOL_F_LOCAL)
+			polnid = numa_node_id();
+		else
+			polnid = pol->v.preferred_node;
+		break;
+	case MPOL_BIND:
+		/*
+		 * Multiple nodes allowed:  keep the page if its node is in
+		 * the nodemask, else pick the nearest allowed node, if any.
+		 */
+		if (node_isset(curnid, pol->v.nodes))
+			goto out;
+		(void)first_zones_zonelist(
+				node_zonelist(numa_node_id(), GFP_HIGHUSER),
+				gfp_zone(GFP_HIGHUSER),
+				&pol->v.nodes, &zone);
+		if (!zone)	/* no allowed nodes:  use current [!misplaced] */
+			goto out;
+		polnid = zone->node;
+		break;
+
+	default:
+		BUG();
+	}
+
+	if (curnid != polnid)
+		ret = MPOL_MIGRATE_NONINTERLEAVED;
+out:
+	mpol_cond_put(pol);
+	if (ret && newnid)
+		*newnid = polnid;
+	return ret;
+}
+#endif /* CONFIG_MIGRATE_ON_FAULT */
Index: linux-2.6.36-mmotm-101103-1217/include/linux/mempolicy.h
===================================================================
--- linux-2.6.36-mmotm-101103-1217.orig/include/linux/mempolicy.h
+++ linux-2.6.36-mmotm-101103-1217/include/linux/mempolicy.h
@@ -245,6 +245,15 @@ extern int show_numa_map(struct seq_file
 struct mpol_range;
 extern struct mpol_range *get_numa_submap(struct vm_area_struct *);
 
+#ifdef CONFIG_MIGRATE_ON_FAULT
+#define MPOL_MIGRATE_NONINTERLEAVED 1
+#define MPOL_MIGRATE_INTERLEAVED 2
+#define misplaced_is_interleaved(pol) ((pol) == MPOL_MIGRATE_INTERLEAVED)
+
+extern int mpol_misplaced(struct page *, struct vm_area_struct *,
+		unsigned long, int *);
+#endif
+
 #else
 
 struct mempolicy {};