On Fri, 2022-07-15 at 06:47 -1000, Tejun Heo wrote:
> (resending, I messed up the message header, sorry)
>
> Hello,
>
> On Fri, Jul 15, 2022 at 01:59:38PM +0200, Michal Koutný wrote:
> > The css->rstat_css_node should not be modified if there are possible RCU
> > readers elsewhere.
> > One way to fix this would be to insert synchronize_rcu() after
> > list_del_rcu() and before list_add_rcu().
> > (A further alternative (I've heard about) would be to utilize 'nulls'
> > RCU lists [1] to make the move between lists detectable.)
> >
> > But as I'm looking at it from distance, it may be simpler and sufficient
> > to just take cgroup_rstat_lock around the list migration (the nesting
> > under cgroup_mutex that's held with rebind_subsystems() is fine).
>
> synchronize_rcu() prolly is the better fit here given how that list_node's
> usage, but yeah, great find.
>
> Thanks.
>

Hi Michal and Tejun,

Thanks for your suggestion.
According to your description, is the following patch correct?

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1813,6 +1813,7 @@
 		if (ss->css_rstat_flush) {
 			list_del_rcu(&css->rstat_css_node);
+			synchronize_rcu();
 			list_add_rcu(&css->rstat_css_node,
 				     &dcgrp->rstat_css_list);
 		}

If the patch is correct, we will add it to our stability test
and continue to observe whether the problem is solved.

Thank you.

Best regards,
Jing-Ting Wu