+ mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask.patch added to -mm tree

The patch titled
     Subject: mm, cpuset: always use seqlock when changing task's nodemask
has been added to the -mm tree.  Its filename is
     mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Vlastimil Babka <vbabka@xxxxxxx>
Subject: mm, cpuset: always use seqlock when changing task's nodemask

When updating a task's mems_allowed and rebinding its mempolicy because
the cpuset's mems have changed, we currently take the seqlock for writing
only when the task has a mempolicy or the new mems have no intersection
with the old mems.  This should be enough to prevent a parallel
allocation from seeing no available nodes, but the optimization is IMHO
unnecessary (cpuset updates should not be frequent), and we still
potentially risk issues if the intersection of new and old nodes has a
limited amount of free/reclaimable memory.  Let's just use the seqlock
for all tasks.

Link: http://lkml.kernel.org/r/20170517081140.30654-6-vbabka@xxxxxxx
Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
Acked-by: Michal Hocko <mhocko@xxxxxxxx>
Cc: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Dimitri Sivanich <sivanich@xxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Li Zefan <lizefan@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
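
Editorial note, not part of the patch: the changelog above relies on the
seqcount pattern, in which the writer bumps a sequence counter before and
after the update and a reader retries if the counter moved. Below is a
minimal userspace sketch of that pattern using C11 atomics. All toy_* names
are hypothetical stand-ins, the nodemask is assumed to fit in a single
64-bit word, and the task_lock(), IRQ disabling, and mempolicy rebind done
by the real cpuset_change_task_nodemask() are not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant task_struct fields. */
struct toy_task {
	atomic_uint seq;               /* even: stable, odd: update in progress */
	_Atomic uint64_t mems_allowed; /* one bit per node (assumes <= 64 nodes) */
};

static void toy_task_init(struct toy_task *t, uint64_t mems)
{
	atomic_init(&t->seq, 0);
	atomic_init(&t->mems_allowed, mems);
}

/* Writer side: the sequence is always bumped, with no need_loop shortcut. */
static void toy_change_nodemask(struct toy_task *t, uint64_t newmems)
{
	/* Writers must be serialized by the caller (the kernel uses task_lock). */
	assert((atomic_load(&t->seq) & 1) == 0);

	atomic_fetch_add(&t->seq, 1);               /* begin: seq becomes odd */
	atomic_fetch_or(&t->mems_allowed, newmems); /* old | new first ... */
	atomic_store(&t->mems_allowed, newmems);    /* ... then only new */
	atomic_fetch_add(&t->seq, 1);               /* end: seq is even again */
}

/* Reader side: wait out an in-progress update, then remember the sequence. */
static unsigned toy_read_begin(struct toy_task *t)
{
	unsigned s;

	while ((s = atomic_load(&t->seq)) & 1)
		; /* a writer is mid-update; spin until it finishes */
	return s;
}

/* True if an update happened since toy_read_begin(); the reader must retry. */
static bool toy_read_retry(struct toy_task *t, unsigned s)
{
	return atomic_load(&t->seq) != s;
}
```

A reader would bracket its allocation attempt with toy_read_begin() and
toy_read_retry(), and retry rather than report failure when the latter
returns true. Because the writer in this sketch bumps the sequence
unconditionally, every concurrent update is visible to such a reader,
which is the behaviour this patch makes unconditional in the kernel.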

 kernel/cgroup/cpuset.c |   29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff -puN kernel/cgroup/cpuset.c~mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask kernel/cgroup/cpuset.c
--- a/kernel/cgroup/cpuset.c~mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask
+++ a/kernel/cgroup/cpuset.c
@@ -1038,38 +1038,25 @@ static void cpuset_post_attach(void)
  * @tsk: the task to change
  * @newmems: new nodes that the task will be set
  *
- * In order to avoid seeing no nodes if the old and new nodes are disjoint,
- * we structure updates as setting all new allowed nodes, then clearing newly
- * disallowed ones.
+ * We use the mems_allowed_seq seqlock to safely update both tsk->mems_allowed
+ * and rebind an eventual tasks' mempolicy. If the task is allocating in
+ * parallel, it might temporarily see an empty intersection, which results in
+ * a seqlock check and retry before OOM or allocation failure.
  */
 static void cpuset_change_task_nodemask(struct task_struct *tsk,
 					nodemask_t *newmems)
 {
-	bool need_loop;
-
 	task_lock(tsk);
-	/*
-	 * Determine if a loop is necessary if another thread is doing
-	 * read_mems_allowed_begin().  If at least one node remains unchanged and
-	 * tsk does not have a mempolicy, then an empty nodemask will not be
-	 * possible when mems_allowed is larger than a word.
-	 */
-	need_loop = task_has_mempolicy(tsk) ||
-			!nodes_intersects(*newmems, tsk->mems_allowed);
 
-	if (need_loop) {
-		local_irq_disable();
-		write_seqcount_begin(&tsk->mems_allowed_seq);
-	}
+	local_irq_disable();
+	write_seqcount_begin(&tsk->mems_allowed_seq);
 
 	nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
 	mpol_rebind_task(tsk, newmems);
 	tsk->mems_allowed = *newmems;
 
-	if (need_loop) {
-		write_seqcount_end(&tsk->mems_allowed_seq);
-		local_irq_enable();
-	}
+	write_seqcount_end(&tsk->mems_allowed_seq);
+	local_irq_enable();
 
 	task_unlock(tsk);
 }
_

Patches currently in -mm which might be from vbabka@xxxxxxx are

mm-page_alloc-fix-more-premature-oom-due-to-race-with-cpuset-update.patch
mm-mempolicy-stop-adjusting-current-il_next-in-mpol_rebind_nodemask.patch
mm-page_alloc-pass-preferred-nid-instead-of-zonelist-to-allocator.patch
mm-mempolicy-simplify-rebinding-mempolicies-when-updating-cpusets.patch
mm-cpuset-always-use-seqlock-when-changing-tasks-nodemask.patch
mm-mempolicy-dont-check-cpuset-seqlock-where-it-doesnt-matter.patch



