On Fri, Oct 20, 2023 at 01:35:42AM +0200, Frederic Weisbecker wrote:
> The commit:
>
> 	cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
>
> fixed an issue where rcutiny would request a quiescent state by
> setting TIF_NEED_RESCHED in early boot when init/0 has the PF_IDLE flag
> set but interrupts aren't enabled yet. A subsequent call to
> cond_resched() would then enable IRQs too early.
>
> When callbacks are enqueued in idle, RCU currently performs the
> following:
>
> 1) Call resched_cpu() to trigger exit from idle and go through the
>    scheduler to call rcu_note_context_switch() -> rcu_qs()
>
> 2) rcu_qs() notes the quiescent state and raises RCU_SOFTIRQ if there
>    is a callback, waking up ksoftirqd since it isn't called from an
>    interrupt.
>
> However, the call to resched_cpu() can opportunistically be replaced,
> and optimized, by raising RCU_SOFTIRQ and forcing a ksoftirqd wakeup
> instead.
>
> It's worth noting that RCU grace period polling while idle is then
> suboptimal, but such a usecase can be considered very rare or even
> non-existent.
>
> The advantage of this optimization is that it also works if PF_IDLE is
> set early, because ksoftirqd is created well after IRQs are enabled on
> boot and it can't be woken before its creation. If
> raise_ksoftirqd_irqoff() is called after the first scheduling point
> but before ksoftirqd is created, nearby voluntary schedule calls are
> expected to provide the desired quiescent state, and in the worst case
> the first run of ksoftirqd, during the first initcalls, is close enough.
>
> Fixes: cff9b2332ab7 ("kernel/sched: Modify initial boot task idle setup")
> Cc: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Cc: Sebastian Siewior <bigeasy@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>

Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

I did a touch-test of the series, and have started overnight testing.

							Thanx, Paul

> ---
>  kernel/rcu/tiny.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
> index fec804b79080..9460e4e9d84c 100644
> --- a/kernel/rcu/tiny.c
> +++ b/kernel/rcu/tiny.c
> @@ -190,12 +190,15 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
>  	local_irq_save(flags);
>  	*rcu_ctrlblk.curtail = head;
>  	rcu_ctrlblk.curtail = &head->next;
> -	local_irq_restore(flags);
>
>  	if (unlikely(is_idle_task(current))) {
> -		/* force scheduling for rcu_qs() */
> -		resched_cpu(0);
> +		/*
> +		 * Force resched to trigger a QS and handle callbacks right after.
> +		 * This also takes care of avoiding too early rescheduling on boot.
> +		 */
> +		raise_ksoftirqd_irqoff(RCU_SOFTIRQ);
>  	}
> +	local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL_GPL(call_rcu);
>
> @@ -228,8 +231,16 @@ unsigned long start_poll_synchronize_rcu(void)
>  	unsigned long gp_seq = get_state_synchronize_rcu();
>
>  	if (unlikely(is_idle_task(current))) {
> +		unsigned long flags;
> +
> -		/* force scheduling for rcu_qs() */
> -		resched_cpu(0);
> +		/*
> +		 * Force resched to trigger a QS. This also takes care of avoiding
> +		 * too early rescheduling on boot. It's suboptimized but GP
> +		 * polling on idle isn't expected much as a usecase.
> +		 */
> +		local_irq_save(flags);
> +		raise_ksoftirqd_irqoff(RCU_SOFTIRQ);
> +		local_irq_restore(flags);
>  	}
>  	return gp_seq;
>  }
> --
> 2.34.1
>
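
For readers without the rest of the series at hand: raise_ksoftirqd_irqoff()
is introduced by an earlier patch in this series and is not quoted above.
Below is a minimal sketch of the behavior the commit message relies on,
assuming it is built from the existing __raise_softirq_irqoff(),
this_cpu_ksoftirqd() and wake_up_process() primitives; the function body is
an illustration of the idea, not the actual patch.

#include <linux/interrupt.h>	/* __raise_softirq_irqoff(), this_cpu_ksoftirqd() */
#include <linux/lockdep.h>	/* lockdep_assert_irqs_disabled() */
#include <linux/sched.h>	/* wake_up_process() */

/* Sketch only; the real helper comes from an earlier patch in the series. */
static inline void raise_ksoftirqd_irqoff(unsigned int nr)
{
	struct task_struct *tsk;

	lockdep_assert_irqs_disabled();

	/* Mark the softirq pending on this CPU; no rescheduling is requested. */
	__raise_softirq_irqoff(nr);

	/*
	 * Wake ksoftirqd directly instead of setting TIF_NEED_RESCHED on the
	 * idle task.  During early boot, before spawn_ksoftirqd() has run,
	 * this_cpu_ksoftirqd() is NULL and the wakeup is simply skipped: the
	 * pending bit is handled on ksoftirqd's first run.
	 */
	tsk = this_cpu_ksoftirqd();
	if (tsk)
		wake_up_process(tsk);
}

With that shape, call_rcu() from the idle task only sets the RCU_SOFTIRQ
pending bit until ksoftirqd exists, which is why no too-early rescheduling
of init/0 can happen before IRQs are enabled.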