Note, there shouldn't be a "(v2)" outside the "[PATCH ]" part. Otherwise it gets added into the git commit during "git am". On Thu, 5 Aug 2021 09:27:16 -0400 Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote: > On a 1->0->1 callbacks transition, there is an issue with the new > callback using the old callback's data. > > Considering __DO_TRACE_CALL: > > do { \ > struct tracepoint_func *it_func_ptr; \ > void *__data; \ > it_func_ptr = \ > rcu_dereference_raw((&__tracepoint_##name)->funcs); \ > if (it_func_ptr) { \ > __data = (it_func_ptr)->data; \ > > ----> [ delayed here on one CPU (e.g. vcpu preempted by the host) ] > > static_call(tp_func_##name)(__data, args); \ > } \ > } while (0) > > It has loaded the tp->funcs of the old callback, so it will try to use the old > data. This can be fixed by adding a RCU sync anywhere in the 1->0->1 > transition chain. > > On a N->2->1 transition, we need an rcu-sync because you may have a > sequence of 3->2->1 (or 1->2->1) where the element 0 data is unchanged > between 2->1, but was changed from 3->2 (or from 1->2), which may be > observed by the static call. This can be fixed by adding an > unconditional RCU sync in transition 2->1. > > A follow up fix will introduce a more lightweight scheme based on RCU > get_state and cond_sync. I'll add here that this patch will cause a huge performance regression on disabling the trace events, but the follow up patch will fix that. Before this patch: # trace-cmd start -e all # time trace-cmd start -p nop real 0m0.778s user 0m0.000s sys 0m0.061s After this patch: # trace-cmd start -e all # time trace-cmd start -p nop real 0m10.593s user 0m0.017s sys 0m0.259s That's more than 10x slow down. Just under a second to disable all events now goes to over 10 seconds! But after the next patch: # trace-cmd start -e all # time trace-cmd start -p nop real 0m0.878s user 0m0.000s sys 0m0.103s Which is in the noise from before this patch. This is a big enough regression, I'll even add a Fixes tag to the next patch on the final sha1 of this patch! Such that this patch won't be backported without the next patch. > > Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@xxxxxxxxx/ > Fixes: d25e37d89dd2 ("tracepoint: Optimize using static_call()") For this patch, I would say the above is what this fixes. -- Steve > Fixes: 547305a64632 ("tracepoint: Fix out of sync data passing by static caller") > Fixes: 352384d5c84e ("tracepoints: Update static_call before tp_funcs when adding a tracepoint") > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx> > Cc: Ingo Molnar <mingo@xxxxxxxxxx> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx> > Cc: Stefan Metzmacher <metze@xxxxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> # 5.10+ > ---