On Tue, Oct 27, 2020 at 04:02:53PM +0800, Muchun Song wrote: > The rcu_read_lock/unlock only can guarantee that the memcg will > not be freed, but it cannot guarantee the success of css_get to > memcg. This can be happened when we reparent the memcg. So using > css_tryget instead of css_get. Hm, I really doubt that it can be reproduced in the real life, but I think you're formally correct. If the whole process of a cgroup offlining is completed between reading a objcg->memcg pointer and bumping the css reference on another CPU, and there are exactly 0 external references to this memory cgroup (how we get to the obj_cgroup_charge() then?), css_get() can change the ref counter from 0 back to 1. Is this a good description of the problem? If so, I think it's worth a more detailed description and mentioning that it's not a real-life bug. It also probably doesn't deserve a stable@ backport. > > Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx> > --- > mm/memcontrol.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index fcbd79c5023e..c0c27bee23ad 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -3223,8 +3223,10 @@ int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size) > * independently later. > */ > rcu_read_lock(); > +retry: > memcg = obj_cgroup_memcg(objcg); > - css_get(&memcg->css); > + if (!css_tryget(&memcg->css)) if (unlikely(!css_tryget(&memcg->css))) ^^^^^^^^ maybe? > + goto retry; Otherwise the fix looks good to me. Please, feel free to add Acked-by: Roman Gushchin <guro@xxxxxx> after extending the commit description. Thanks!