Paul E. McKenney <paulmck@xxxxxxxxxx> writes:

> On Tue, Nov 28, 2023 at 06:04:33PM +0100, Thomas Gleixner wrote:
>> Paul!
>>
>> On Mon, Nov 20 2023 at 16:38, Paul E. McKenney wrote:
>> > But...
>> >
>> > Suppose we have a long-running loop in the kernel that regularly
>> > enables preemption, but only momentarily.  Then the added
>> > rcu_flavor_sched_clock_irq() check would almost always fail, making
>> > for extremely long grace periods.  Or did I miss a change that causes
>> > preempt_enable() to help RCU out?
>>
>> So first of all this is not any different from today, and even with
>> RCU_PREEMPT=y a tight loop:
>>
>>     do {
>>         preempt_disable();
>>         do_stuff();
>>         preempt_enable();
>>     }
>>
>> will not allow rcu_flavor_sched_clock_irq() to detect a QS reliably. All
>> it can do is to force reschedule/preemption after some time, which in
>> turn ends up in a QS.
>
> True, but we don't run RCU_PREEMPT=y on the fleet.  So although this
> argument should offer comfort to those who would like to switch from
> forced preemption to lazy preemption, it doesn't help for those of us
> running NONE/VOLUNTARY.
>
> I can of course compensate if need be by making RCU more aggressive with
> the resched_cpu() hammer, which includes an IPI.  For non-nohz_full CPUs,
> it currently waits halfway to the stall-warning timeout.
>
>> The current NONE/VOLUNTARY models, which imply RCU_PREEMPT=n, cannot do
>> that at all because the preempt_enable() is a NOOP and there is no
>> preemption point at return from interrupt to kernel:
>>
>>     do {
>>         do_stuff();
>>     }
>>
>> So the only thing which makes that "work" is slapping a cond_resched()
>> into the loop:
>>
>>     do {
>>         do_stuff();
>>         cond_resched();
>>     }
>
> Yes, exactly.
>
>> But the whole concept behind LAZY is that the loop will always be:
>>
>>     do {
>>         preempt_disable();
>>         do_stuff();
>>         preempt_enable();
>>     }
>>
>> and the preempt_enable() will always be a functional preemption point.
>
> Understood.  And if preempt_enable() can interact with RCU when requested,
> I would expect that this could make quite a few calls to cond_resched()
> provably unnecessary.  There was some discussion of this:
>
>   https://lore.kernel.org/all/0d6a8e80-c89b-4ded-8de1-8c946874f787@paulmck-laptop/
>
> There were objections to an earlier version.  Is this version OK?

Copying that version here for discussion purposes:

    #define preempt_enable() \
    do { \
            barrier(); \
            if (unlikely(preempt_count_dec_and_test())) \
                    __preempt_schedule(); \
            else if (!IS_ENABLED(CONFIG_PREEMPT_RCU) && \
                     ((preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK | \
                                          HARDIRQ_MASK | NMI_MASK)) == PREEMPT_OFFSET) && \
                     !irqs_disabled()) \
                    rcu_all_qs(); \
    } while (0)

(sched_feat is not exposed outside the scheduler, so I'm using the
!CONFIG_PREEMPT_RCU version here.)

I have twofold objections to this: as PeterZ pointed out, it is quite a
bit heavier than the fairly minimal preempt_enable() -- both conceptually,
in that the preemption logic now needs to know when to check for a
specific RCU quiescent state, and in terms of code size, where it seems
to add about a cacheline's worth to every preempt_enable() site.

If we end up needing this, is it valid to just optimistically check
whether a quiescent state needs to be registered (see below)? This
version does expose rcu_data.rcu_urgent_qs outside RCU, but maybe we
can encapsulate that in linux/rcupdate.h.
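To make the shape of this concrete, the kind of loop under discussion
would then look like the sketch below (do_stuff() and the exit condition
are stand-ins for discussion, not code from any patch in this series):

    /*
     * Under the LAZY model, preempt_enable() is the only preemption
     * point in this loop.  With the optimistic check, a pass through
     * preempt_enable() can notice that RCU is waiting on this CPU and
     * report a quiescent state, so the explicit cond_resched() that
     * NONE/VOLUNTARY need today goes away.
     */
    do {
            preempt_disable();
            do_stuff();
            preempt_enable();
    } while (!done);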
For v1 I will go with this simple check in rcu_flavor_sched_clock_irq()
and see where that gets us:

>	if (this_cpu_read(rcu_data.rcu_urgent_qs))
>		set_need_resched();

---
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 9aa6358a1a16..d8139cda8814 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -226,9 +226,11 @@ do { \
 #ifdef CONFIG_PREEMPTION
 #define preempt_enable() \
 do { \
 	barrier(); \
 	if (unlikely(preempt_count_dec_and_test())) \
 		__preempt_schedule(); \
+	else if (unlikely(raw_cpu_read(rcu_data.rcu_urgent_qs))) \
+		rcu_all_qs_check(); \
 } while (0)
 
 #define preempt_enable_notrace() \

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 41021080ad25..2ba2743d7ba3 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -887,6 +887,15 @@ void rcu_all_qs(void)
 }
 EXPORT_SYMBOL_GPL(rcu_all_qs);
 
+void rcu_all_qs_check(void)
+{
+	if (((preempt_count() &
+	      (PREEMPT_MASK | SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK)) == PREEMPT_OFFSET) &&
+	    !irqs_disabled())
+		rcu_all_qs();
+}
+EXPORT_SYMBOL_GPL(rcu_all_qs_check);
+
 /*
  * Note a PREEMPTION=n context switch.  The caller must have disabled interrupts.
  */

--
ankur
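PS: on encapsulating rcu_data.rcu_urgent_qs in linux/rcupdate.h, the
obvious shape would be a trivial inline wrapper -- a sketch only:
rcu_urgent_qs_pending() is a made-up name, and it still needs the
rcu_data per-CPU declaration to be visible outside kernel/rcu/, which
is exactly the exposure problem mentioned above:

	/*
	 * Hypothetical helper so that preempt_enable() does not have to
	 * touch rcu_data directly.
	 */
	static __always_inline bool rcu_urgent_qs_pending(void)
	{
		return unlikely(raw_cpu_read(rcu_data.rcu_urgent_qs));
	}

with the preempt_enable() branch above then reading:

	else if (rcu_urgent_qs_pending()) \
		rcu_all_qs_check(); \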