On 22/03/23 15:04, Peter Zijlstra wrote:
> On Wed, Mar 22, 2023 at 12:20:28PM +0000, Valentin Schneider wrote:
>> On 22/03/23 10:53, Peter Zijlstra wrote:
>
>> > Hurmph... so we only really consume @func when we IPI. Would it not be
>> > more useful to trace this thing for *every* csd enqueued?
>>
>> It's true that any CSD enqueued on that CPU's call_single_queue in the
>> [first CSD llist_add()'ed, IPI IRQ hits] timeframe is a potential source of
>> interference.
>>
>> However, can we be sure that first CSD isn't an indirect cause for the
>> following ones? say the target CPU exits RCU EQS due to the IPI, there's a
>> bit of time before it gets to flush_smp_call_function_queue() where some other CSD
>> could be enqueued *because* of that change in state.
>>
>> I couldn't find an easy example of that, I might be biased as this is where
>> I'd like to go wrt IPI'ing isolated CPUs in usermode. But regardless, when
>> correlating an IPI IRQ with its source, we'd always have to look at the
>> first CSD in that CSD stack.
>
> So I was thinking something like this:
>
> ---
> Subject: trace,smp: Trace all smp_function_call*() invocations
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Wed Mar 22 14:58:36 CET 2023
>
> (Ab)use the trace_ipi_send_cpu*() family to trace all
> smp_function_call*() invocations, not only those that result in an
> actual IPI.
>
> The queued entries log their callback function while the actual IPIs
> are traced on generic_smp_call_function_single_interrupt().
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
>  kernel/smp.c |   58 ++++++++++++++++++++++++++++++----------------------------
>  1 file changed, 30 insertions(+), 28 deletions(-)
>
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -106,18 +106,20 @@ void __init call_function_init(void)
>  }
>
>  static __always_inline void
> -send_call_function_single_ipi(int cpu, smp_call_func_t func)
> +send_call_function_single_ipi(int cpu)
>  {
>  	if (call_function_single_prep_ipi(cpu)) {
> -		trace_ipi_send_cpu(cpu, _RET_IP_, func);
> +		trace_ipi_send_cpu(cpu, _RET_IP_,
> +				   generic_smp_call_function_single_interrupt);

Hm, this does get rid of the func being passed down the helpers, but this
means the trace events are now stateful, i.e. I need the first and last
events in a CSD stack to figure out which one actually caused the IPI.

It also requires whoever is looking at the trace to be aware of which IPIs
are attached to a CSD, and which ones aren't. ATM that's only the resched
IPI, but per the cover letter there's more to come (e.g. tick_broadcast()
for arm64/riscv and a few others). For instance:

       hackbench-157   [001]    10.894320: ipi_send_cpu: cpu=3 callsite=check_preempt_curr+0x37 callback=0x0
       hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=sched_ttwu_pending+0x0
       hackbench-157   [001]    10.895068: ipi_send_cpu: cpu=3 callsite=try_to_wake_up+0x29e callback=generic_smp_call_function_single_interrupt+0x0

That first one sent a RESCHEDULE IPI, the second one a CALL_FUNCTION one,
but you really have to know what you're looking at...

Are you worried about the @func being pushed down? Staring at x86 asm is
not good for the soul, but AFAICT this does cause an extra register to be
popped in the prologue because all of the helpers are __always_inline, so
both paths of the static key(s) are in the same stackframe.

I can "improve" this with:

---
diff --git a/kernel/smp.c b/kernel/smp.c
index 5cd680a7e78ef..55f120dae1713 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -511,6 +511,26 @@ raw_smp_call_single_queue(int cpu, struct llist_node *node, smp_call_func_t func
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
 
+static noinline void __smp_call_single_queue_trace(int cpu, struct llist_node *node)
+{
+	call_single_data_t *csd;
+	smp_call_func_t func;
+
+
+	/*
+	 * We have to check the type of the CSD before queueing it, because
+	 * once queued it can have its flags cleared by
+	 *   flush_smp_call_function_queue()
+	 * even if we haven't sent the smp_call IPI yet (e.g. the stopper
+	 * executes migration_cpu_stop() on the remote CPU).
+	 */
+	csd = container_of(node, call_single_data_t, node.llist);
+	func = CSD_TYPE(csd) == CSD_TYPE_TTWU ?
+		sched_ttwu_pending : csd->func;
+
+	raw_smp_call_single_queue(cpu, node, func);
+}
+
 void __smp_call_single_queue(int cpu, struct llist_node *node)
 {
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
@@ -525,25 +545,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
 		}
 	}
 #endif
-	/*
-	 * We have to check the type of the CSD before queueing it, because
-	 * once queued it can have its flags cleared by
-	 *   flush_smp_call_function_queue()
-	 * even if we haven't sent the smp_call IPI yet (e.g. the stopper
-	 * executes migration_cpu_stop() on the remote CPU).
-	 */
-	if (trace_ipi_send_cpumask_enabled()) {
-		call_single_data_t *csd;
-		smp_call_func_t func;
-
-		csd = container_of(node, call_single_data_t, node.llist);
-		func = CSD_TYPE(csd) == CSD_TYPE_TTWU ?
-			sched_ttwu_pending : csd->func;
-
-		raw_smp_call_single_queue(cpu, node, func);
-	} else {
+	if (trace_ipi_send_cpumask_enabled())
+		__smp_call_single_queue_trace(cpu, node);
+	else
 		raw_smp_call_single_queue(cpu, node, NULL);
-	}
 }
 
 /*