On 14/10/21 00:07, Boqun Feng wrote:
>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> index e38028d48648..b236271b9022 100644
>> --- a/kernel/rcu/tree.c
>> +++ b/kernel/rcu/tree.c
>> @@ -2717,6 +2717,23 @@ static __latent_entropy void rcu_core(void)
>>  	unsigned long flags;
>>  	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
>>  	struct rcu_node *rnp = rdp->mynode;
>> +	/*
>> +	 * On RT rcu_core() can be preempted when IRQs aren't disabled.
>> +	 * Therefore this function can race with concurrent NOCB (de-)offloading
>> +	 * on this CPU and the below condition must be considered volatile.
>> +	 * However if we race with:
>> +	 *
>> +	 * _ Offloading:   In the worst case we accelerate or process callbacks
>> +	 *                 concurrently with NOCB kthreads. We are guaranteed to
>> +	 *                 call rcu_nocb_lock() if that happens.
>
> If offloading races with rcu_core(), can the following happen?
>
> 	<offload work>
> 	rcu_nocb_rdp_offload():
> 					rcu_core():
> 					  ...
> 					  rcu_nocb_lock_irqsave(); // not a lock
> 	  raw_spin_lock_irqsave(->nocb_lock);
> 	  rdp_offload_toggle():
> 	    <LOCKING | OFFLOADED set>
> 					  if (!rcu_segcblist_restempty(...))
> 					    rcu_accelerate_cbs_unlocked(...);
> 					  rcu_nocb_unlock_irqrestore();
> 					  // ^ a real unlock,
> 					  // and will preempt_enable()
> 	  // offload continues with ->nocb_lock not held
>
> If this can happen, it means an unpaired preempt_enable() and an
> incorrect unlock. Thoughts? Maybe I'm missing something here?
>

As Frederic pointed out in an earlier thread [1], this can't happen
because we still have IRQs disabled, and the offloading process has to
run on the CPU being offloaded. IOW, in the above scenario, rcu_core()
can't be preempted by the rcu_nocb_rdp_offload() work until it
re-enables IRQs at rcu_nocb_unlock_irqrestore().

(hopefully Paul or Frederic will correct me if I've just spewed garbage)

[1]: https://lore.kernel.org/lkml/20210930105340.GA232241@lothringen/

> Regards,
> Boqun
>