On Wed, Jun 22, 2022 at 06:35:49PM +0800, Zqiang wrote: > In CONFIG_PREEMPT=n and CONFIG_PREEMPT_COUNT=y kernel, after a exp > grace period begins, if detected current CPU enters idle in > rcu_exp_handler() IPI handler, will immediately report the exp QS of the > current cpu, at this time, maybe not being in an RCU read-side critical > section, but need wait until rcu-softirq or sched-clock irq or sched-switch > occurs on current CPU to check and report exp QS. > >I think the idea is OK, however, this "optimization" is based on the >implementation detail that rcu_read_lock() counts preempt_count when >CONFIG_PREEMPT_COUNT=y, right? It's a little bit dangerous because the >preempt_count when CONFIG_PREEMPT_COUNT=y and CONFIG_PREEMPT=n is mostly >for debugging purposes IIUC, and in other words, _it could be gone_. > Yes, for CONFIG_PREEMPT_COUNT=y and CONFIG_PREEMPT=n kernel The rcu_read_lock/unlock are replaced by preempt_disbale/enable, and the preempt-count is exists, so can report exp QS when not being an RCU read-side critical(preempt_count & (PREEMPT_MASK | SOFTIRQ_MASK )return zero). in IPI handler. For CONFIG_PREEMPT_COUNT=n and CONFIG_PREEMPT=n kernel, The rcu_read_lock/unlock is just barrier(). So I add IS_ENABLED(CONFIG_PREEMPT_COUNT) check in code. Of course, for CONFIG_PREEMPT_COUNT=n kernel, in RCU softirq, the preempt-count is also checked /* Report any deferred quiescent states if preemption enabled. */ if (IS_ENABLED(CONFIG_PREEMPT_COUNT) && (!(preempt_count() & PREEMPT_MASK))) { rcu_preempt_deferred_qs(current); but the RCU softirq may not be triggered in time and reported exp QS, for example a kernel loop exist on NO_HZ_FULL CPU this change, It is to capture the exp QS state earlier and report it. >Also I'm not aware of any but there could be someone assuming that RCU >read-side critical sections can be formed without >rcu_read_{lock,unlock}() in CONFIG_PREEMPT=n kernel. For example, there >might be "creative" code like the following: > > void do_something_only_in_nonpreempt(void) > { > int *p; > > // This function only gets called in PREEMPT=n kernel, > // which means everywhere is a RCU read-side critical > // section, let's save some lines of code. > rcu_read_lock(); > p = rcu_dereference_check(gp, !IS_ENABLED(PREEMPT)); > ... // of course no schedule() here. > <access p> rcu_read_unlock(); > } > Usually access to pointers of type rcu needs to be protected. Any thoughts? >Again, I'm not aware of any existing code that does this but we need to >be sure. > >Regards, >Boqun > > This commit add a exp QS check in rcu_exp_handler(), when not being > in an RCU read-side critical section, report exp QS earlier. > > Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx> > --- > kernel/rcu/tree_exp.h | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h > index be667583a554..34f08267410f 100644 > --- a/kernel/rcu/tree_exp.h > +++ b/kernel/rcu/tree_exp.h > @@ -828,11 +828,14 @@ static void rcu_exp_handler(void *unused) > { > struct rcu_data *rdp = this_cpu_ptr(&rcu_data); > struct rcu_node *rnp = rdp->mynode; > + bool preempt_bh_disabled = > + !!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)); > > if (!(READ_ONCE(rnp->expmask) & rdp->grpmask) || > __this_cpu_read(rcu_data.cpu_no_qs.b.exp)) > return; > - if (rcu_is_cpu_rrupt_from_idle()) { > + if (rcu_is_cpu_rrupt_from_idle() || > + (IS_ENABLED(CONFIG_PREEMPT_COUNT) && !preempt_bh_disabled)) { > rcu_report_exp_rdp(this_cpu_ptr(&rcu_data)); > return; > } > -- > 2.25.1 >