Re: [RFC] memcg: fix race between css_offline and async charge (was: Re: Possible regression with cgroups in 3.11)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 13, 2013 at 02:23:37PM +0100, Michal Hocko wrote:
> Johannes, what do you think about something like this?
> I have just compile tested it.
> --- 
> >From 73042adc905847bfe401ae12073d1c479db8fdab Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@xxxxxxx>
> Date: Wed, 13 Nov 2013 13:53:54 +0100
> Subject: [PATCH] memcg: fix race between css_offline and async charge
> 
> As correctly pointed out by Johannes, charges done on behalf of a group
> where the current task is not its member (swap accounting currently) are
> racy wrt. memcg offlining (mem_cgroup_css_offline).
> 
> This might lead to a charge leak as follows:
> 
>                            rcu_read_lock()
>                            css_tryget()
>                            rcu_read_unlock()
> disable tryget()
> call_rcu()
>   offline_css()
>     reparent_charges()
>                            res_counter_charge()
>                            css_put()
>                              css_free()
>                            pc->mem_cgroup = deadcg
>                            add page to lru
> 
> If a group has a parent then the parent's res_counter would have a
> charge which doesn't have any corresponding page on any reachable LRUs
> under its hierarchy and so it won't be able to free/reparent its own
> charges when going away and end up looping in reparent_charges for ever.
> 
> This patch fixes the issue by introducing memcg->offline flag which is
> set when memcg is offlined (and the memcg is not reachable anymore).
> 
> The only async charger we have currently (swapin accounting path) checks
> the offline status after successful charge and uncharges and falls back
> to charge the current task if the group is offline now.
> 
> Spotted-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxx>

Ultimately, we want to have the tryget and the res_counter charge in
the same rcu readlock region because cgroup already provides rcu
protection.  We need a quick fix until then.

So I'm not sure why you are sending more patches, I already provided a
one-liner change that should take care of this and you didn't say why
it wouldn't work.
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux