Re: [RFC PATCH 48/86] rcu: handle quiescent states for PREEMPT_RCU=n

Paul E. McKenney <paulmck@xxxxxxxxxx> writes:

> On Tue, Nov 28, 2023 at 06:04:33PM +0100, Thomas Gleixner wrote:
>> Paul!
>>
>> On Mon, Nov 20 2023 at 16:38, Paul E. McKenney wrote:
>> > But...
>> >
>> > Suppose we have a long-running loop in the kernel that regularly
>> > enables preemption, but only momentarily.  Then the added
>> > rcu_flavor_sched_clock_irq() check would almost always fail, making
>> > for extremely long grace periods.  Or did I miss a change that causes
>> > preempt_enable() to help RCU out?
>>
>> So first of all this is not any different from today and even with
>> RCU_PREEMPT=y a tight loop:
>>
>>     do {
>>     	preempt_disable();
>>         do_stuff();
>>         preempt_enable();
>>     }
>>
>> will not allow rcu_flavor_sched_clock_irq() to detect QS reliably. All
>> it can do is to force reschedule/preemption after some time, which in
>> turn ends up in a QS.
>
> True, but we don't run RCU_PREEMPT=y on the fleet.  So although this
> argument should offer comfort to those who would like to switch from
> forced preemption to lazy preemption, it doesn't help for those of us
> running NONE/VOLUNTARY.
>
> I can of course compensate if need be by making RCU more aggressive with
> the resched_cpu() hammer, which includes an IPI.  For non-nohz_full CPUs,
> it currently waits halfway to the stall-warning timeout.
>
>> The current NONE/VOLUNTARY models, which imply RCU_PREEMPT=n, cannot do
>> that at all because preempt_enable() is a NOOP and there is no
>> preemption point at return from interrupt to kernel.
>>
>>     do {
>>         do_stuff();
>>     }
>>
>> So the only thing which makes that "work" is slapping a cond_resched()
>> into the loop:
>>
>>     do {
>>         do_stuff();
>>         cond_resched();
>>     }
>
> Yes, exactly.
>
>> But the whole concept behind LAZY is that the loop will always be:
>>
>>     do {
>>     	preempt_disable();
>>         do_stuff();
>>         preempt_enable();
>>     }
>>
>> and the preempt_enable() will always be a functional preemption point.
>
> Understood.  And if preempt_enable() can interact with RCU when requested,
> I would expect that this could make quite a few calls to cond_resched()
> provably unnecessary.  There was some discussion of this:
>
> https://lore.kernel.org/all/0d6a8e80-c89b-4ded-8de1-8c946874f787@paulmck-laptop/
>
> There were objections to an earlier version.  Is this version OK?

Copying that version here for discussion purposes:

        #define preempt_enable() \
        do { \
                barrier(); \
                if (unlikely(preempt_count_dec_and_test())) \
                        __preempt_schedule(); \
                else if (!IS_ENABLED(CONFIG_PREEMPT_RCU) && \
                         ((preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK)) == PREEMPT_OFFSET) && \
                         !irqs_disabled()) \
                        rcu_all_qs(); \
        } while (0)

(sched_feat() is not exposed outside the scheduler, so I'm using the
!CONFIG_PREEMPT_RCU version here.)
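
To unpack the preempt_count() test (mask layout as in linux/preempt.h;
this is the same condition rcu_all_qs_check() uses in the diff below):

        /*
         * The mask keeps only the nesting counters:
         *   PREEMPT_MASK  - preempt_disable() depth
         *   SOFTIRQ_MASK  - softirq nesting / softirq in progress
         *   HARDIRQ_MASK  - hardirq nesting
         *   NMI_MASK      - NMI nesting
         *
         * Requiring the masked value to equal PREEMPT_OFFSET (exactly one
         * preempt-disable level, no interrupt nesting at all), together
         * with !irqs_disabled(), confines the rcu_all_qs() call to task
         * context where it is safe to report a quiescent state.
         */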


My objections to this are twofold: as PeterZ pointed out, it is
quite a bit heavier than the fairly minimal preempt_enable() -- both
conceptually, in that the preemption logic now needs to know when to
check for a specific RCU quiescent state, and in terms of code size
(it seems to add about a cacheline's worth) at every preempt_enable()
site.

If we end up needing this, is it valid to just optimistically check
whether a quiescent state needs to be reported (see below)?
This version does expose rcu_data.rcu_urgent_qs outside RCU, but maybe
we can encapsulate that in linux/rcupdate.h.
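
(Hoisting that first test looks safe because rcu_all_qs() already opens
with it and bails out when nothing is urgent; roughly, sketching the
current kernel/rcu/tree_plugin.h code:

        void rcu_all_qs(void)
        {
                if (!raw_cpu_read(rcu_data.rcu_urgent_qs))
                        return;         /* common case: no urgent QS requested */
                preempt_disable();
                /* ... recheck with preemption off, then report the QS ... */
                rcu_qs();
                preempt_enable();
        }

so the preempt_enable() fast path only duplicates that single percpu
load.)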

For V1 I will go with this simple check in rcu_flavor_sched_clock_irq()
and see where that gets us:

>         if (this_cpu_read(rcu_data.rcu_urgent_qs))
>         	set_need_resched();
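
For reference, in the PREEMPT_RCU=n flavor of rcu_flavor_sched_clock_irq()
that check would land roughly as below (a sketch only, assuming the
set_need_resched() helper this series introduces; the user/idle branch
is today's code):

        static void rcu_flavor_sched_clock_irq(int user)
        {
                if (user || rcu_is_cpu_rrupt_from_idle()) {
                        /* Interrupt from user mode or idle: already a QS. */
                        rcu_qs();
                } else if (this_cpu_read(rcu_data.rcu_urgent_qs)) {
                        /* Force a preemption point; the resulting context
                         * switch reports the QS. */
                        set_need_resched();
                }
        }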

---
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 9aa6358a1a16..d8139cda8814 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -226,9 +226,11 @@ do { \
 #ifdef CONFIG_PREEMPTION
 #define preempt_enable() \
 do { \
 	barrier(); \
 	if (unlikely(preempt_count_dec_and_test())) \
 		__preempt_schedule(); \
+	else if (unlikely(raw_cpu_read(rcu_data.rcu_urgent_qs))) \
+		rcu_all_qs_check(); \
 } while (0)

 #define preempt_enable_notrace() \
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 41021080ad25..2ba2743d7ba3 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -887,6 +887,15 @@ void rcu_all_qs(void)
 }
 EXPORT_SYMBOL_GPL(rcu_all_qs);
 
+void rcu_all_qs_check(void)
+{
+	if (((preempt_count() &
+	      (PREEMPT_MASK | SOFTIRQ_MASK | HARDIRQ_MASK | NMI_MASK)) == PREEMPT_OFFSET) &&
+	    !irqs_disabled())
+		rcu_all_qs();
+}
+EXPORT_SYMBOL_GPL(rcu_all_qs_check);
+
 /*
  * Note a PREEMPTION=n context switch. The caller must have disabled interrupts.
  */

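With something like this in place, Thomas's canonical loop should pass
through quiescent states on NONE/VOLUNTARY kernels without needing a
cond_resched() (do_stuff()/done being the placeholders from his example):

        do {
                preempt_disable();
                do_stuff();
                preempt_enable();       /* can report a QS once rcu_urgent_qs is set */
        } while (!done);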

--
ankur



