On Fri, Sep 30, 2022 at 02:06:48PM +0000, Alexander Fedorov wrote:
> Hi,
>
> reposting this to the mainline list as requested and with an updated patch.
>
> I've encountered a race on kernel version 5.10.131-rt72 when running the
> LTP cgroup_fj_stress_memory* tests and need help with understanding the
> synchronization in mm/memcontrol.c; it seems really non-trivial...
> I have also checked the patches in the latest mainline kernel but do not
> see anything similar to the problem. The realtime patch also does not
> seem to be the culprit: it just changed preemption to migration disabling
> and irq_disable to local_lock.
>
> It goes as follows:
>
> 1) First CPU:
>    css_killed_work_fn() -> mem_cgroup_css_offline() ->
>    drain_all_stock() -> obj_stock_flush_required()
>
>	if (stock->cached_objcg) {
>
>    This check sees a non-NULL pointer for *another* CPU's `memcg_stock`
>    instance.
>
> 2) Second CPU:
>    css_free_rwork_fn() -> __mem_cgroup_free() -> free_percpu() ->
>    obj_cgroup_uncharge() -> drain_obj_stock()
>
>    It frees the `cached_objcg` pointer in its own `memcg_stock` instance:
>
>	struct obj_cgroup *old = stock->cached_objcg;
>	< ... >
>	obj_cgroup_put(old);
>	stock->cached_objcg = NULL;
>
> 3) First CPU continues after the 'if' check and re-reads the pointer;
>    now it is NULL and dereferencing it leads to a kernel panic:
>
>	static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
>					     struct mem_cgroup *root_memcg)
>	{
>	< ... >
>		if (stock->cached_objcg) {
>			memcg = obj_cgroup_memcg(stock->cached_objcg);

Great catch!

I'm not sure about switching to rcu primitives though. In all other
cases stock->cached_objcg is accessed only from the local cpu, so using
rcu_* functions would be overkill.

How about something like this? (completely untested)

Also, please add
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")

Thank you!

--

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b69979c9ced5..93e9637108f0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3245,10 +3245,18 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
				      struct mem_cgroup *root_memcg)
 {
+	struct obj_cgroup *objcg;
 	struct mem_cgroup *memcg;
 
-	if (stock->cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->cached_objcg);
+	/*
+	 * stock->cached_objcg can be changed asynchronously, so read
+	 * it using READ_ONCE(). The objcg can't go away though because
+	 * obj_stock_flush_required() is called from within a rcu read
+	 * section.
+	 */
+	objcg = READ_ONCE(stock->cached_objcg);
+	if (objcg) {
+		memcg = obj_cgroup_memcg(objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}

Thank you!
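
P.S. If we go the READ_ONCE() route, the plain stores that set and clear
stock->cached_objcg on the local cpu could be switched to WRITE_ONCE() as
well, to make the pairing explicit. Below is a rough, untested sketch of
the clearing side only, modelled on the snippet quoted above; the function
name and the elided flush step are placeholders, not the actual mainline
code.

	/*
	 * Sketch only: pair the lockless READ_ONCE() in
	 * obj_stock_flush_required() with a WRITE_ONCE() where the local
	 * cpu drops its cached objcg, so a remote reader sees either the
	 * old pointer or NULL.
	 */
	static struct obj_cgroup *drain_obj_stock_sketch(struct memcg_stock_pcp *stock)
	{
		struct obj_cgroup *old = stock->cached_objcg;

		if (!old)
			return NULL;

		/* ... flush stock->nr_bytes back to the objcg as before ... */

		WRITE_ONCE(stock->cached_objcg, NULL);

		/* the caller drops the reference to 'old' via obj_cgroup_put() */
		return old;
	}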