On Thu, Mar 20, 2014 at 09:40:20PM -0700, Hugh Dickins wrote:
> On Thu, 20 Mar 2014, Greg KH wrote:
> > On Wed, Mar 19, 2014 at 10:59:27PM -0700, Hugh Dickins wrote:
> > > On Wed, 19 Mar 2014, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> > > >
> > > > The patch below does not apply to the 3.13-stable tree.
> > > > If someone wants it applied there, or to any other stable or longterm
> > > > tree, then please email the backport, including the original git commit
> > > > id to <stable@xxxxxxxxxxxxxxx>.
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > > >
> > > ------------------ version for 3.13.7 ------------------
> >
> > Thanks, now applied.
>
> Thanks for including this in the 3.13.7-stable review series, Greg.
> But I notice there wasn't one in the 3.10.34-stable review series:
> I now think that I should have interpreted your FAILED on 3.13 mail
> as a prompt to send equivalents for the earlier releases too, sorry.
>
> The version for Jiri's 3.12.15 would be the same as that for 3.13.7.
>
> But the version for 3.10.34 (or perhaps now 3.10.35) is this below.
> Yes, more differences, and the old mem_cgroup_reparent_charges line
> is intentionally left in for 3.10 whereas it was removed for 3.12+:
> that's because the css/cgroup iterator changed in between, it used
> not to supply the root of the subtree, but nowadays it does.
>
> Thanks,
> Hugh
>
> ------------------ version for 3.10.34 ------------------
>
> From 4fb1a86fb5e4209a7d4426d4e586c58e9edc74ac Mon Sep 17 00:00:00 2001
> From: Filipe Brandenburger <filbranden@xxxxxxxxxx>
> Date: Mon, 3 Mar 2014 15:38:25 -0800
> Subject: [PATCH] memcg: reparent charges of children before processing parent
>
> Sometimes the cleanup after memcg hierarchy testing gets stuck in
> mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.
>
> There may turn out to be several causes, but a major cause is this: the
> workitem to offline parent can get run before workitem to offline child;
> parent's mem_cgroup_reparent_charges() circles around waiting for the
> child's pages to be reparented to its lrus, but it's holding
> cgroup_mutex which prevents the child from reaching its
> mem_cgroup_reparent_charges().
>
> Further testing showed that an ordered workqueue for cgroup_destroy_wq
> is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched
> stage on the way can mess up the order before reaching the workqueue.
>
> Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
> all its children (and grandchildren, in the correct order) to have their
> charges reparented first.
>
> Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
> Signed-off-by: Filipe Brandenburger <filbranden@xxxxxxxxxx>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Reviewed-by: Tejun Heo <tj@xxxxxxxxxx>
> Acked-by: Michal Hocko <mhocko@xxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> [v3.10+]
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6326,9 +6326,23 @@ static void mem_cgroup_invalidate_reclai
>  static void mem_cgroup_css_offline(struct cgroup *cont)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_cont(cont);
> +	struct cgroup *iter;
>
>  	mem_cgroup_invalidate_reclaim_iterators(memcg);
> +
> +	/*
> +	 * This requires that offlining is serialized.  Right now that is
> +	 * guaranteed because css_killed_work_fn() holds the cgroup_mutex.
> +	 */
> +	rcu_read_lock();
> +	cgroup_for_each_descendant_post(iter, cont) {
> +		rcu_read_unlock();
> +		mem_cgroup_reparent_charges(mem_cgroup_from_cont(iter));
> +		rcu_read_lock();
> +	}
> +	rcu_read_unlock();
>  	mem_cgroup_reparent_charges(memcg);

Is this correct?
^^^

I may be missing something, but I believe this call to
mem_cgroup_reparent_charges() should be dropped (as in the original
commit and in your 3.13 backport).

Cheers,

--
Luís

> +
>  	mem_cgroup_destroy_all_caches(memcg);
>  }
>
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html