On Mon, Mar 22, 2021 at 10:46 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Sat, Mar 20, 2021 at 12:38:14AM +0800, Muchun Song wrote:
> > The rcu_read_lock/unlock only can guarantee that the memcg will not be
> > freed, but it cannot guarantee the success of css_get (which is in the
> > refill_stock when cached memcg changed) to memcg.
> >
> >   rcu_read_lock()
> >   memcg = obj_cgroup_memcg(old)
> >   __memcg_kmem_uncharge(memcg)
> >       refill_stock(memcg)
> >           if (stock->cached != memcg)
> >               // css_get can change the ref counter from 0 back to 1.
> >               css_get(&memcg->css)
> >   rcu_read_unlock()
> >
> > This fix is very like the commit:
> >
> >   eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge")
> >
> > Fix this by holding a reference to the memcg which is passed to the
> > __memcg_kmem_uncharge() before calling __memcg_kmem_uncharge().
> >
> > Fixes: 3de7d4f25a74 ("mm: memcg/slab: optimize objcg stock draining")
> > Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
>
> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>
> Good catch! Did you trigger the WARN_ON() in
> percpu_ref_kill_and_confirm() during testing?

No. The race window is very small, so it should be difficult to trigger.
I only realized there might be a problem while reviewing this code, so
finding it was a coincidence.

Thanks.
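
For illustration only, here is a minimal userspace sketch of why an
unconditional "get" is unsafe once the count may already be zero, and how a
conditional "tryget" avoids bringing the object back from 0 to 1. This is an
analogy in C11 atomics, not the kernel's css/percpu_ref implementation; the
names struct obj, ref_get, ref_tryget and ref_put are hypothetical, and the
thread above does not say which primitive the patch itself uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object with a reference count. */
struct obj {
    atomic_int refs;
};

/* Unconditional get: only safe if the caller already holds a reference. */
static void ref_get(struct obj *o)
{
    atomic_fetch_add(&o->refs, 1);
}

/* Conditional get: take a reference only if the count is still nonzero. */
static bool ref_tryget(struct obj *o)
{
    int old = atomic_load(&o->refs);

    while (old > 0) {
        /* On failure, old is updated to the current value and we retry. */
        if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the last reference went away. */
static bool ref_put(struct obj *o)
{
    int old = atomic_fetch_sub(&o->refs, 1);

    assert(old > 0);
    return old == 1;
}
```

With a count that has already reached zero, ref_tryget() fails and leaves the
count at zero, while ref_get() moves it from 0 back to 1, which is the
situation the comment in the quoted call chain describes.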