On Mon, Dec 14, 2020 at 02:37:16PM -0800, Yang Shi wrote: > The shrink_slab_memcg() races with mem_cgroup_css_online(). A visibility of CSS_ONLINE flag > in shrink_slab_memcg()->mem_cgroup_online() does not guarantee that we will see > memcg->nodeinfo[nid]->shrinker_maps != NULL. This may occur because of processor reordering > on !x86. > > This seems like the below case: > > CPU A CPU B > store shrinker_map load CSS_ONLINE > store CSS_ONLINE load shrinker_map > > So the memory ordering could be guaranteed by smp_wmb()/smp_rmb() pair. > > The memory barriers pair will guarantee the ordering between shrinker_deferred and CSS_ONLINE > for the following patches as well. This should not require memory barriers in the shrinker code. The code that sets and checks the CSS_ONLINE flag should have the memory barriers to ensure that anything that sees an online CSS will see it completely set up. That is, the functions online_css() that set the CSS_ONLINE needs a memory barrier to ensure all previous writes are completed before the CSS_ONLINE flag is set, and the function mem_cgroup_online() needs a barrier to pair with that. This is the same existence issue that the superblock shrinkers have with the shrinkers being registered before the superblock is fully set up. The SB_BORN flag on the sueprblock indicates the superblock is now fully set up ("online" in CSS speak) and the registered shrinker can run. Please see the smp_wmb() before we set SB_BORN in vfs_get_tree(), and the big comment about the smp_rmb() -after- we check SB_BORN in super_cache_count() to understand the details of the data dependency between the flag and the structures being set up that the barriers enforce. IOWs, these memory barriers belong inside the cgroup code to guarantee anything that sees an online cgroup will always see the fully initialised cgroup structures. They do not belong in the shrinker infrastructure... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx