On Fri, Mar 29, 2019 at 11:26:34AM -0700, Paul E. McKenney wrote: > When RCU core processing is offloaded from RCU_SOFTIRQ to the rcuc > kthreads, a full and unconditional wakeup is required to initiate RCU > core processing. In contrast, when RCU core processing is carried > out by RCU_SOFTIRQ, a raise_softirq() suffices. Of course, there are > situations where raise_softirq() does a full wakeup, but these do not > occur with normal usage of rcu_read_unlock(). Do we have a comment somewhere explaining why? > The initial solution to this problem was to use set_tsk_need_resched() and > set_preempt_need_resched() to force a future context switch, which allows > rcu_preempt_note_context_switch() to report the deferred quiescent state > to RCU's core processing. Unfortunately for expedited grace periods, > there can be a significant delay between the call for a context switch > and the actual context switch. This is all PREEMPT=y kernels, right? Where is the latency coming from? Because PREEMPT=y _should_ react quite quickly. > This commit therefore introduces a ->deferred_qs flag to the task_struct > structure's rcu_special structure. This flag is initially false, and > is set to true by the first call to rcu_read_unlock() requiring special > attention, then finally reset back to false when the quiescent state is > finally reported. Then rcu_read_unlock() attempts full wakeups only when > ->deferred_qs is false, that is, on the first rcu_read_unlock() requiring > special attention. Note that a chain of RCU readers linked by some other > sort of reader may find that a later rcu_read_unlock() is once again able > to do a full wakeup, courtesy of an intervening preemption: > > rcu_read_lock(); > /* preempted */ > local_irq_disable(); > rcu_read_unlock(); /* Can do full wakeup, sets ->deferred_qs. */ > rcu_read_lock(); > local_irq_enable(); > preempt_disable() > rcu_read_unlock(); /* Cannot do full wakeup, ->deferred_qs set. */ > rcu_read_lock(); > preempt_enable(); > /* preempted, >deferred_qs reset. */ As it would have without ->deferred_sq and just having done the above which was deemed insufficient. So I'm really puzzled by the need for all this. > local_irq_disable(); > rcu_read_unlock(); /* Can again do full wakeup, sets ->deferred_qs. */ > > Such linked RCU readers do not yet seem to appear in the Linux kernel, and > it is probably best if they don't. However, RCU needs to handle them, and > some variations on this theme could make even raise_softirq() unsafe due to > the possibility of its doing a full wakeup. This commit therefore also > avoids invoking raise_softirq() when the ->deferred_qs set flag is set. > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxx> > Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> > --- > include/linux/sched.h | 2 +- > kernel/rcu/tree_plugin.h | 19 ++++++++++++++----- > 2 files changed, 15 insertions(+), 6 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 1549584a1538..3164b6798fe5 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -566,7 +566,7 @@ union rcu_special { > u8 blocked; > u8 need_qs; > u8 exp_hint; /* Hint for performance. */ > - u8 pad; /* No garbage from compiler! */ > + u8 deferred_qs; > } b; /* Bits. */ > u32 s; /* Set of bits. */ > }; > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > index 21611862e083..75110ea75d01 100644 > --- a/kernel/rcu/tree_plugin.h > +++ b/kernel/rcu/tree_plugin.h > @@ -455,6 +455,7 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) > local_irq_restore(flags); > return; > } > + t->rcu_read_unlock_special.b.deferred_qs = false; > if (special.b.need_qs) { > rcu_qs(); > t->rcu_read_unlock_special.b.need_qs = false; > @@ -605,16 +606,24 @@ static void rcu_read_unlock_special(struct task_struct *t) > local_irq_save(flags); > irqs_were_disabled = irqs_disabled_flags(flags); > if (preempt_bh_were_disabled || irqs_were_disabled) { > - WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, false); > - /* Need to defer quiescent state until everything is enabled. */ > - if (irqs_were_disabled && use_softirq) { > - /* Enabling irqs does not reschedule, so... */ > + t->rcu_read_unlock_special.b.exp_hint = false; > + // Need to defer quiescent state until everything is enabled. > + if (irqs_were_disabled && use_softirq && > + (in_irq() || !t->rcu_read_unlock_special.b.deferred_qs)) { > + // Using softirq, safe to awaken, and we get > + // no help from enabling irqs, unlike bh/preempt. > raise_softirq_irqoff(RCU_SOFTIRQ); > + } else if (irqs_were_disabled && !use_softirq && > + !t->rcu_read_unlock_special.b.deferred_qs) { > + // Safe to awaken and we get no help from enabling > + // irqs, unlike bh/preempt. > + invoke_rcu_core(); > } else { > - /* Enabling BH or preempt does reschedule, so... */ > + // Enabling BH or preempt does reschedule, so... > set_tsk_need_resched(current); > set_preempt_need_resched(); > } > + t->rcu_read_unlock_special.b.deferred_qs = true; > local_irq_restore(flags); > return; > } That is all quite horrible and begging for sane solution. Would not something like: static void rcu_force_resched(struct irq_work *work) { set_tsk_need_resched(current); set_preempt_need_resched(); } if (irqs_were_disabled) irq_work_queue(&t->irq_work, rcu_force_resched); Solve the whole thing?