On Fri, Sep 16, 2022 at 10:19:14PM +0000, Joel Fernandes wrote:
[...]
> >> +	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags))
> >> +		return; // Enqueued onto ->nocb_bypass, so just leave.
> >> +	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
> >> +	rcu_segcblist_enqueue(&rdp->cblist, head);
> >> +
> >>  	trace_rcu_segcb_stats(&rdp->cblist, TPS("SegCBQueued"));
> >>
> >>  	/* Go handle any RCU core processing required. */
> >
> > Two subtle changes induced here:
> >
> > * rcu_segcblist_n_cbs() is now read locklessly. It's just tracing, so no
> >   huge deal, but still, if this races with callback invocation, we may on
> >   some rare occasion read stale numbers in traces while enqueuing (think
> >   about rcu_top, for example).
>
> Actually, I disagree with this point now. Changes to the number of
> callbacks in the main ->cblist can be lockless. It uses atomics on
> CONFIG_RCU_NOCB. On non-CONFIG_RCU_NOCB, CBs cannot be queued
> concurrently, as interrupts will be disabled.
>
> Also, in rcu_do_batch(), the count is manipulated after calling
> rcu_nocb_lock_irqsave(rdp, flags);
>
> > * trace_rcu_callback() will now show the number of callbacks _before_
> >   enqueuing instead of _after_. Not sure if it matters, but sometimes
> >   tools rely on trace events.
>
> Yeah, this is fixable, and good point. So how about the below?
>
> ---8<-----------------------
>
> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
> Subject: [PATCH] rcu: Call trace_rcu_callback() also for bypass queuing
>
> If any CB is queued onto the bypass list, then trace_rcu_callback()
> does not show it. This makes it unclear when a callback was actually
> queued, as you only end up getting a trace_rcu_invoke_callback() trace.
> Fix it by calling the tracing function even for bypass queuing.
>
> Also, while at it, optimize the tracing so that rcu_state is not
> accessed here if tracing is disabled, because that access is useless if
> we are not tracing. A quick inspection of the generated assembly shows
> that rcu_state is accessed even if the jump label for the tracepoint is
> disabled:
>
> __trace_rcu_callback:
> 	movq	8(%rdi), %rcx
> 	movq	rcu_state+3640(%rip), %rax
> 	movq	%rdi, %rdx
> 	cmpq	$4095, %rcx
> 	ja	.L3100
> 	movq	192(%rsi), %r8
> 1:	jmp	.L3101			# objtool NOPs this
> 	.pushsection __jump_table, "aw"
> 	.balign 8
> 	.long 1b - .
> 	.long .L3101 - .
> 	.quad __tracepoint_rcu_kvfree_callback+8 + 2 - .
> 	.popsection
>
> With this change, the jump-label check that gets NOPed out is moved to
> the beginning of the function.
>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
>  kernel/rcu/tree.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 5ec97e3f7468..b64df55f7f55 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2728,6 +2728,23 @@ static void check_cb_ovld(struct rcu_data *rdp)
>  	raw_spin_unlock_rcu_node(rnp);
>  }
>
> +/*
> + * Trace RCU callback helper, call after enqueuing callback.
> + * The ->cblist must be locked when called.

Also, sorry for the spam: this comment is stale now, so I will delete
this line.
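
[Editor's note: a minimal sketch of the pattern the patch describes, since
the patch body is truncated above. The idea is to hoist the tracepoint's
static-key ("jump label") test to the top of the helper via the generated
trace_<event>_enabled() predicates, so rcu_state and the callback count are
never touched when tracing is off. The helper body below is an illustration
assuming the kernel/rcu/tree.c context (struct rcu_data,
rcu_segcblist_n_cbs(), __is_kvfree_rcu_offset()); it is not taken verbatim
from the truncated patch.]

/*
 * Sketch: trace a just-queued callback, but only compute the tracepoint
 * arguments (rcu_state.name, callback count) when a tracepoint is
 * actually enabled.  trace_<event>_enabled() compiles down to the same
 * NOPable jump-label test the tracepoint itself uses, so the rcu_state
 * access is skipped entirely when tracing is disabled.
 */
static void __trace_rcu_callback(struct rcu_head *head, struct rcu_data *rdp)
{
	if (trace_rcu_kvfree_callback_enabled() &&
	    __is_kvfree_rcu_offset((unsigned long)head->func))
		trace_rcu_kvfree_callback(rcu_state.name, head,
					  (unsigned long)head->func,
					  rcu_segcblist_n_cbs(&rdp->cblist));
	else if (trace_rcu_callback_enabled())
		trace_rcu_callback(rcu_state.name, head,
				   rcu_segcblist_n_cbs(&rdp->cblist));
}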
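[Editor's note: and a sketch of how the enqueue path could invoke such a
helper so that bypass-queued callbacks are traced too. The shape follows
the hunk quoted at the top of the thread; the helper call placement is
assumed, not taken from the truncated patch.]

	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags)) {
		__trace_rcu_callback(head, rdp); /* trace bypass enqueues too */
		return; // Enqueued onto ->nocb_bypass, so just leave.
	}
	// If no-CBs CPU gets here, rcu_nocb_try_bypass() acquired ->nocb_lock.
	rcu_segcblist_enqueue(&rdp->cblist, head);
	__trace_rcu_callback(head, rdp); /* count now reflects the new CB */

Tracing after rcu_segcblist_enqueue() keeps the reported callback count at
its post-enqueue value, which addresses the _before_/_after_ concern raised
in the review.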