On Fri, Jul 22, 2022 at 08:52:13AM +0800, Zqiang wrote: > In PREEMPT_RCU kernel, for multi-node rcu tree, if the RCU read > critical section is preempted, the current task are only queued > leaf rcu_node blkd list, for single-node rcu tree, the root node > is also leaf node. > > This commit add rcu_is_leaf_node() to filter out checks for non-leaf > rcu_node. > > Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx> > >First, thank you for digging into this! > > --- > kernel/rcu/tree_plugin.h | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h > index b2219577fbe2..a9df11ec65af 100644 > --- a/kernel/rcu/tree_plugin.h > +++ b/kernel/rcu/tree_plugin.h > @@ -693,6 +693,8 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) > > RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n"); > raw_lockdep_assert_held_rcu_node(rnp); > + if (!rcu_is_leaf_node(rnp)) > + goto end; > if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp))) > dump_blkd_tasks(rnp, 10); > if (rcu_preempt_has_tasks(rnp) && >So you are adding a simple check for all rcu_node structures >that removes two simple checks for non-leaf rcu_node structures. > >Assuming that the costs of all these checks is roughly the same (but >please feel free to actually measure them on real hardware), what what >fraction of the rcu_node structures must be non-leaf for this change to >be a win? Then for what configurations using default fanouts is this >fraction exceeded? > >The default fanout is 16 from non-leaf to leaf and 64 from non-leaf >to non-leaf (32 for non-leaf to non-leaf on 32-bit systems). Thanks for reminding, I only considered the logic in the code, not the actual real usage scenarios. I will do some tests on the actual hardware. 😊 Thanks Zqiang > >Hey, you wanted some algebra practice anyway. ;-) > @@ -703,6 +705,7 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp) > trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"), > rnp->gp_seq, t->pid); > } > +end: > WARN_ON_ONCE(rnp->qsmask); > } > > @@ -1178,7 +1181,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags) > */ > static void rcu_preempt_boost_start_gp(struct rcu_node *rnp) > { > - rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES; > + if (rcu_is_leaf_node(rnp)) > + rnp->boost_time = jiffies + RCU_BOOST_DELAY_JIFFIES; > >And here you are adding a check on all nodes to eliminate an addition >and a store on non-leaf nodes. Same questions as above, with the same >invitation to actually measure the costs of these operations. > > Thanx, Paul > > } > > /* > -- > 2.25.1 >