[PATCH 3/15] Mempolicy: Write lock mmap_sem while changing task mempolicy


Against:  2.6.25-rc8-mm1

A read of /proc/<pid>/numa_maps holds the target task's mmap_sem
for read while examining each vma's mempolicy.  A vma's mempolicy
can fall back to the task's policy.  However, the task could be
changing its task policy and freeing the one that show_numa_maps()
is examining.

To prevent this, grab the mmap_sem for write when updating task
mempolicy.  This was pointed out to me by Christoph Lameter, and the
change is extracted and reworked from Christoph's alternative mempolicy
reference counting patch.

This is analogous to the way that do_mbind() and do_get_mempolicy()
prevent races between tasks sharing an mm_struct [a.k.a. threads]
setting and querying a mempolicy for a particular address.

Note:  this is necessary, but not sufficient, to allow us to stop
taking an extra reference on another task's mempolicy in
get_vma_policy().  Subsequent patches will complete this update,
allowing us to simplify the tests for whether we need to unref a
mempolicy at various points in the code.

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@xxxxxx>

 mm/mempolicy.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

Index: linux-2.6.25-rc8-mm1/mm/mempolicy.c
===================================================================
--- linux-2.6.25-rc8-mm1.orig/mm/mempolicy.c	2008-04-02 17:32:24.000000000 -0400
+++ linux-2.6.25-rc8-mm1/mm/mempolicy.c	2008-04-02 17:32:35.000000000 -0400
@@ -591,16 +591,29 @@ static long do_set_mempolicy(unsigned sh
 			     nodemask_t *nodes)
 {
 	struct mempolicy *new;
+	struct mm_struct *mm = current->mm;
 
 	new = mpol_new(mode, flags, nodes);
 	if (IS_ERR(new))
 		return PTR_ERR(new);
+
+	/*
+	 * prevent changing our mempolicy while show_numa_maps()
+	 * is using it.
+	 * Note:  do_set_mempolicy() can be called at init time
+	 * with no 'mm'.
+	 */
+	if (mm)
+		down_write(&mm->mmap_sem);
 	mpol_put(current->mempolicy);
 	current->mempolicy = new;
 	mpol_set_task_struct_flag();
 	if (new && new->policy == MPOL_INTERLEAVE &&
 	    nodes_weight(new->v.nodes))
 		current->il_next = first_node(new->v.nodes);
+	if (mm)
+		up_write(&mm->mmap_sem);
+
 	return 0;
 }
 