Re: [RFC] memcg: fix race between css_offline and async charge (was: Re: Possible regression with cgroups in 3.11)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 13, 2013 at 04:13:39PM +0100, Michal Hocko wrote:
> On Wed 13-11-13 09:54:27, Johannes Weiner wrote:
> > On Wed, Nov 13, 2013 at 02:23:37PM +0100, Michal Hocko wrote:
> > > Johannes, what do you think about something like this?
> > > I have just compile tested it.
> > > --- 
> > > >From 73042adc905847bfe401ae12073d1c479db8fdab Mon Sep 17 00:00:00 2001
> > > From: Michal Hocko <mhocko@xxxxxxx>
> > > Date: Wed, 13 Nov 2013 13:53:54 +0100
> > > Subject: [PATCH] memcg: fix race between css_offline and async charge
> > > 
> > > As correctly pointed out by Johannes, charges done on behalf of a group
> > > where the current task is not its member (swap accounting currently) are
> > > racy wrt. memcg offlining (mem_cgroup_css_offline).
> > > 
> > > This might lead to a charge leak as follows:
> > > 
> > >                            rcu_read_lock()
> > >                            css_tryget()
> > >                            rcu_read_unlock()
> > > disable tryget()
> > > call_rcu()
> > >   offline_css()
> > >     reparent_charges()
> > >                            res_counter_charge()
> > >                            css_put()
> > >                              css_free()
> > >                            pc->mem_cgroup = deadcg
> > >                            add page to lru
> > > 
> > > If a group has a parent then the parent's res_counter would have a
> > > charge which doesn't have any corresponding page on any reachable LRUs
> > > under its hierarchy and so it won't be able to free/reparent its own
> > > charges when going away and end up looping in reparent_charges for ever.
> > > 
> > > This patch fixes the issue by introducing memcg->offline flag which is
> > > set when memcg is offlined (and the memcg is not reachable anymore).
> > > 
> > > The only async charger we have currently (swapin accounting path) checks
> > > the offline status after successful charge and uncharges and falls back
> > > to charge the current task if the group is offline now.
> > > 
> > > Spotted-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
> > 
> > Ultimately, we want to have the tryget and the res_counter charge in
> > the same rcu readlock region because cgroup already provides rcu
> > protection.  We need a quick fix until then.
> > 
> > So I'm not sure why you are sending more patches, I already provided a
> > one-liner change that should take care of this and you didn't say why
> > it wouldn't work.
> 
> I've completely forgot about that one. Sorry about that! Yes, that one
> will work as well (it would be sufficient to call
> mem_cgroup_reparent_charges only if there are any charges left). Both
> approaches are hacks so I do not care either way.

Yes, I think we could turn the reparent_charges loop into a

  while (res_counter_read...)

loop.  This would actually make more sense regardless of my patch.

Thanks for the ack in the other email.
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux