[PATCH - 3.10 stable backport] memcg: fix endless loop caused by mem_cgroup_iter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Backport is at the end of the email

On Thu 06-02-14 15:38:34, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> 
> The patch below does not apply to the 3.10-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@xxxxxxxxxxxxxxx>.
> 
> thanks,
> 
> greg k-h
> 
> ------------------ original commit in Linus's tree ------------------
> 
> From ecc736fc3c71c411a9d201d8588c9e7e049e5d8c Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@xxxxxxx>
> Date: Thu, 23 Jan 2014 15:53:35 -0800
> Subject: [PATCH] memcg: fix endless loop caused by mem_cgroup_iter
> 
> Hugh has reported an endless loop when the hardlimit reclaim sees the
> same group all the time.  This might happen when the reclaim races with
> the memcg removal.
> 
> shrink_zone
>                                                 [rmdir root]
>   mem_cgroup_iter(root, NULL, reclaim)
>     // prev = NULL
>     rcu_read_lock()
>     mem_cgroup_iter_load
>       last_visited = iter->last_visited   // gets root || NULL
>       css_tryget(last_visited)            // failed
>       last_visited = NULL                 [1]
>     memcg = root = __mem_cgroup_iter_next(root, NULL)
>     mem_cgroup_iter_update
>       iter->last_visited = root;
>     reclaim->generation = iter->generation
> 
>  mem_cgroup_iter(root, root, reclaim)
>    // prev = root
>    rcu_read_lock
>     mem_cgroup_iter_load
>       last_visited = iter->last_visited   // gets root
>       css_tryget(last_visited)            // failed
>     [1]
> 
> The issue seemed to be introduced by commit 5f5781619718 ("memcg: relax
> memcg iter caching") which has replaced unconditional css_get/css_put by
> css_tryget/css_put for the cached iterator.
> 
> This patch fixes the issue by skipping css_tryget on the root of the
> tree walk in mem_cgroup_iter_load and symmetrically doesn't release it
> in mem_cgroup_iter_update.
> 
> Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
> Reported-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Tested-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Greg Thelen <gthelen@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>	[3.10+]
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c8336e8f8df0..da07784dde87 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1158,7 +1158,15 @@ mem_cgroup_iter_load(struct mem_cgroup_reclaim_iter *iter,
>  	if (iter->last_dead_count == *sequence) {
>  		smp_rmb();
>  		position = iter->last_visited;
> -		if (position && !css_tryget(&position->css))
> +
> +		/*
> +		 * We cannot take a reference to root because we might race
> +		 * with root removal and returning NULL would end up in
> +		 * an endless loop on the iterator user level when root
> +		 * would be returned all the time.
> +		 */
> +		if (position && position != root &&
> +				!css_tryget(&position->css))
>  			position = NULL;
>  	}
>  	return position;
> @@ -1167,9 +1175,11 @@ mem_cgroup_iter_load(struct mem_cgroup_reclaim_iter *iter,
>  static void mem_cgroup_iter_update(struct mem_cgroup_reclaim_iter *iter,
>  				   struct mem_cgroup *last_visited,
>  				   struct mem_cgroup *new_position,
> +				   struct mem_cgroup *root,
>  				   int sequence)
>  {
> -	if (last_visited)
> +	/* root reference counting symmetric to mem_cgroup_iter_load */
> +	if (last_visited && last_visited != root)
>  		css_put(&last_visited->css);
>  	/*
>  	 * We store the sequence count from the time @last_visited was
> @@ -1244,7 +1254,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
>  		memcg = __mem_cgroup_iter_next(root, last_visited);
>  
>  		if (reclaim) {
> -			mem_cgroup_iter_update(iter, last_visited, memcg, seq);
> +			mem_cgroup_iter_update(iter, last_visited, memcg, root,
> +					seq);
>  
>  			if (!memcg)
>  				iter->generation++;
> 

>From 17b70365e80f1a43b8d4a6b575c8c0dff9cb65e7 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxx>
Date: Thu, 23 Jan 2014 15:53:35 -0800
Subject: [PATCH] memcg: fix endless loop caused by mem_cgroup_iter

Upstream commit ecc736fc3c71c411a9d201d8588c9e7e049e5d8c

Hugh has reported an endless loop when the hardlimit reclaim sees the
same group all the time.  This might happen when the reclaim races with
the memcg removal.

shrink_zone
                                                [rmdir root]
  mem_cgroup_iter(root, NULL, reclaim)
    // prev = NULL
    rcu_read_lock()
    mem_cgroup_iter_load
      last_visited = iter->last_visited   // gets root || NULL
      css_tryget(last_visited)            // failed
      last_visited = NULL                 [1]
    memcg = root = __mem_cgroup_iter_next(root, NULL)
    mem_cgroup_iter_update
      iter->last_visited = root;
    reclaim->generation = iter->generation

 mem_cgroup_iter(root, root, reclaim)
   // prev = root
   rcu_read_lock
    mem_cgroup_iter_load
      last_visited = iter->last_visited   // gets root
      css_tryget(last_visited)            // failed
    [1]

The issue seemed to be introduced by commit 5f5781619718 ("memcg: relax
memcg iter caching") which has replaced unconditional css_get/css_put by
css_tryget/css_put for the cached iterator.

This patch fixes the issue by skipping css_tryget on the root of the
tree walk in mem_cgroup_iter_load and symmetrically doesn't release it
in mem_cgroup_iter_update.

Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
Reported-by: Hugh Dickins <hughd@xxxxxxxxxx>
Tested-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Greg Thelen <gthelen@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>	[3.10+]
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 134e2106f467..6115b2bbd6ea 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1220,7 +1220,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 			if (dead_count == iter->last_dead_count) {
 				smp_rmb();
 				last_visited = iter->last_visited;
-				if (last_visited &&
+				if (last_visited && last_visited != root &&
 				    !css_tryget(&last_visited->css))
 					last_visited = NULL;
 			}
@@ -1229,7 +1229,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		memcg = __mem_cgroup_iter_next(root, last_visited);
 
 		if (reclaim) {
-			if (last_visited)
+			if (last_visited && last_visited != root)
 				css_put(&last_visited->css);
 
 			iter->last_visited = memcg;
-- 
1.9.rc1

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]