[PATCH 1/2] mm/memcontrol: check cmpxchg(iter->pos...) result in mem_cgroup_iter()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Check the return value of cmpxchg when updating iter->position in
mem_cgroup_iter().  If cmpxchg failed, i.e. a different thread won
the race to update iter->position, then restart the entire flow of
reading, processing and updating iter->position.  Simply ensuring
that there aren't multiple writes to iter->position doesn't avoid
redundant reclaims of a memcg, as competing threads will compute
the same memcg given the same iter->position.

The cmpxchg will only fail if a different thread saw the same value
of iter->position, meaning it called css_next_descendant_pre() with
the same css and therefore computed the same memcg (ignoring the
corner case where the threads see different versions of the tree).

Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
---
 mm/memcontrol.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 16c556a..6a7ca3c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -758,6 +758,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 
 	rcu_read_lock();
 
+start:
 	if (reclaim) {
 		struct mem_cgroup_per_node *mz;
 
@@ -818,11 +819,27 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 
 	if (reclaim) {
 		/*
-		 * The position could have already been updated by a competing
-		 * thread, so check that the value hasn't changed since we read
-		 * it to avoid reclaiming from the same cgroup twice.
+		 * Competing reclaim threads may attempt to consume the same
+		 * iter->position, check that the value hasn't changed since
+		 * we read it to avoid reclaiming from the same cgroup twice.
+		 * Note that just avoiding multiple writes to iter->position
+		 * does not prevent redundant reclaims to memcg.  Given the
+		 * same input css on competing threads, the css returned by
+		 * css_next_descendant_pre will also be the same (unless the
+		 * tree itself changes).  So, if a different thread read the
+		 * same iter->position, then it also computed the same memcg.
+		 * If we lost the race, put our css references and restart
+		 * the entire process of reading and updating iter->position.
 		 */
-		(void)cmpxchg(&iter->position, pos, memcg);
+		if (cmpxchg(&iter->position, pos, memcg) != pos) {
+			if (memcg && memcg != root)
+				css_put(&memcg->css);
+			if (pos)
+				css_put(&pos->css);
+			css = NULL;
+			memcg = NULL;
+			goto start;
+		}
 
 		if (pos)
 			css_put(&pos->css);
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux