On Wed, Feb 09, 2022 at 11:53:33PM +0530, Mukesh Ojha wrote:
>
> On 2/5/2022 4:25 AM, Paul E. McKenney wrote:
> > Although it is usually safe to invoke synchronize_rcu_expedited() from a
> > preemption-enabled CPU-hotplug notifier, if it is invoked from a notifier
> > between CPUHP_AP_RCUTREE_ONLINE and CPUHP_AP_ACTIVE, its attempts to
> > invoke a workqueue handler will hang due to RCU waiting on a CPU that
> > the scheduler is not paying attention to.  This commit therefore expands
> > use of the existing workqueue-independent synchronize_rcu_expedited()
> > from early boot to also include CPUs that are being hotplugged.
> >
> > Link: https://lore.kernel.org/lkml/7359f994-8aaf-3cea-f5cf-c0d3929689d6@xxxxxxxxxxx/
> > Reported-by: Mukesh Ojha <quic_mojha@xxxxxxxxxxx>
> > Cc: Tejun Heo <tj@xxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > ---
> >  kernel/rcu/tree_exp.h | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > index 60197ea24ceb9..1a45667402260 100644
> > --- a/kernel/rcu/tree_exp.h
> > +++ b/kernel/rcu/tree_exp.h
> > @@ -816,7 +816,7 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
> >   */
> >  void synchronize_rcu_expedited(void)
> >  {
> > -	bool boottime = (rcu_scheduler_active == RCU_SCHEDULER_INIT);
> > +	bool no_wq;
> >  	struct rcu_exp_work rew;
> >  	struct rcu_node *rnp;
> >  	unsigned long s;
> > @@ -841,9 +841,15 @@ void synchronize_rcu_expedited(void)
> >  	if (exp_funnel_lock(s))
> >  		return;  /* Someone else did our work for us. */
> > +	/* Don't use workqueue during boot or from an incoming CPU. */
> > +	preempt_disable();
> > +	no_wq = rcu_scheduler_active == RCU_SCHEDULER_INIT ||
> > +		!cpumask_test_cpu(smp_processor_id(), cpu_active_mask);
> > +	preempt_enable();
> > +
> >  	/* Ensure that load happens before action based on it. */
> > -	if (unlikely(boottime)) {
> > -		/* Direct call during scheduler init and early_initcalls(). */
> > +	if (unlikely(no_wq)) {
> > +		/* Direct call for scheduler init, early_initcall()s, and incoming CPUs. */
> >  		rcu_exp_sel_wait_wake(s);
> >  	} else {
> >  		/* Marshall arguments & schedule the expedited grace period. */
> > @@ -861,7 +867,7 @@ void synchronize_rcu_expedited(void)
> >  	/* Let the next expedited grace period start. */
> >  	mutex_unlock(&rcu_state.exp_mutex);
> > -	if (likely(!boottime))
> > +	if (likely(!no_wq))
> >  		destroy_work_on_stack(&rew.rew_work);
> >  }
> >  EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
>
> Can we reach a condition after this change where no_wq = true and during
> rcu_stall report where exp_task = 0 list and exp_mask contain only this cpu
> ?

Hello, Mukesh, and thank you for looking this over!

At first glance, I do not believe that this can happen because the
expedited grace-period machinery avoids waiting on the current CPU.
(See sync_rcu_exp_select_node_cpus(), both the raw_smp_processor_id()
early in the function and the get_cpu() later in the function.)

But please let me know if I am missing something here.

But suppose that we could in fact reach this condition.  What bad thing
would happen?  Other than a resched_cpu() having been invoked several
times on a not-yet-online CPU, of course.  ;-)

							Thanx, Paul
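
For anyone following along, the no_wq decision in the quoted patch can be modeled in userspace C. This is only a sketch under stated assumptions: the model_ names, the enum, and the 64-bit bitmask standing in for cpu_active_mask are made up for illustration and are not kernel APIs. The preempt_disable()/preempt_enable() pair that keeps smp_processor_id() stable in the real code has no counterpart here, because the CPU number is simply passed in.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. The enum is a hypothetical stand-in for the
 * scheduler states that rcu_scheduler_active is compared against.
 */
enum model_sched_state {
	MODEL_SCHEDULER_INACTIVE,
	MODEL_SCHEDULER_INIT,
	MODEL_SCHEDULER_RUNNING,
};

#define MODEL_NR_CPUS 64	/* assumption: one bit per CPU in a 64-bit mask */

/* Hypothetical stand-in for cpumask_test_cpu(cpu, cpu_active_mask). */
static inline bool model_cpu_active(unsigned long long active_mask, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return (active_mask >> cpu) & 1ULL;
}

/*
 * Mirrors the condition in the patch: skip the workqueue during
 * scheduler init, or when the calling CPU is not yet marked active
 * (that is, it is still on its way in through CPU hotplug).
 */
static inline bool model_no_wq(enum model_sched_state state,
			       unsigned long long active_mask, int cpu)
{
	return state == MODEL_SCHEDULER_INIT ||
	       !model_cpu_active(active_mask, cpu);
}
```

With this model, a caller on CPU 2 while only CPUs 0 and 1 are active (mask 0x3) takes the direct-call path, and the same caller takes the workqueue path once bit 2 is set in the mask.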