On Wed, May 19, 2021 at 08:59:05AM -0700, Paul E. McKenney wrote:
> On Wed, May 19, 2021 at 02:09:29AM +0200, Frederic Weisbecker wrote:
> > At CPU offline time, we make sure to flush any pending wakeup for the
> > nocb_gp kthread linked to the outgoing CPU.
> >
> > Now we are making sure of that twice:
> >
> > 1) From rcu_report_dead() when the outgoing CPU makes the very last
> >    local cleanups by itself before switching offline.
> >
> > 2) From rcutree_dead_cpu(). Here the offlining CPU has gone and is truly
> >    now offline. Another CPU takes care of post-mortem cleaning up and
> >    checks if the offline CPU had a pending wakeup.
> >
> > Both ways are fine but we have to choose one or the other because we
> > don't need to repeat that action. Simply benefit from cache locality
> > and keep only the first solution.
>
> But between those two calls, the CPU takes a full pass through the
> scheduler and heads into the idle loop.  What if there is a call_rcu()
> along the way, and if this was the last online CPU in its rcuog kthread's
> group of CPUs?  Wouldn't that callback be stranded until one of those
> CPUs came back online?

Nope, rcu_report_dead() is called from the idle path right before
arch_cpu_idle_dead(). There should be no call to the scheduler until
the CPU comes back online.

Thanks!
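
To illustrate the ordering argument, here is a minimal userspace sketch.
It is not kernel code: every function with a _model suffix, the trace
array, and the two flags are hypothetical stand-ins named after the
functions discussed in this thread. It only models the claim that the
last pass through the scheduler happens before rcu_report_dead(), and
that nothing but arch_cpu_idle_dead() follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace model only; names mirror the thread but are hypothetical. */
#define MAX_EVENTS 8

const char *trace[MAX_EVENTS];
int nr_events;
static bool rcu_reported_dead;       /* CPU has done its last RCU cleanup */
static bool deferred_wakeup_pending; /* pending nocb_gp kthread wakeup */

static void record(const char *event)
{
	assert(nr_events < MAX_EVENTS);
	trace[nr_events++] = event;
}

/* Stand-in for a scheduler pass: must never run after rcu_report_dead(). */
static void schedule_model(void)
{
	assert(!rcu_reported_dead);
	record("schedule");
}

/* Stand-in for rcu_report_dead(): flush the pending wakeup locally. */
static void rcu_report_dead_model(void)
{
	if (deferred_wakeup_pending) {
		deferred_wakeup_pending = false;
		record("flush_deferred_wakeup");
	}
	rcu_reported_dead = true;
	record("rcu_report_dead");
}

static void arch_cpu_idle_dead_model(void)
{
	record("arch_cpu_idle_dead");
}

/* Offline branch of the idle path, in the order described in the reply. */
static void idle_offline_path_model(void)
{
	rcu_report_dead_model();
	arch_cpu_idle_dead_model();
}

/* Run one offline sequence and return the number of recorded events. */
int simulate_cpu_offline(void)
{
	nr_events = 0;
	rcu_reported_dead = false;
	deferred_wakeup_pending = true;

	schedule_model();          /* final scheduler pass into idle */
	idle_offline_path_model(); /* no scheduler call after this point */
	return nr_events;
}

/* Helper to compare a recorded event against an expected name. */
bool event_is(int i, const char *name)
{
	return i < nr_events && strcmp(trace[i], name) == 0;
}
```

Calling simulate_cpu_offline() from a small main() and walking trace[]
shows the scheduler pass first and the wakeup flush before the CPU goes
dead. Adding a schedule_model() call after idle_offline_path_model()
would trip the assertion, which is the situation the question above was
worried about.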