On Mon, May 6, 2019 at 5:10 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > But the CPU that was rewriting instructions does a run_sync() after > removing the int3: > > static void run_sync(void) > { > int enable_irqs; > > /* No need to sync if there's only one CPU */ > if (num_online_cpus() == 1) > return; > > enable_irqs = irqs_disabled(); > > /* We may be called with interrupts disabled (on bootup). */ > if (enable_irqs) > local_irq_enable(); > on_each_cpu(do_sync_core, NULL, 1); > if (enable_irqs) > local_irq_disable(); > } > > Which sends an IPI to all CPUs to make sure they no longer see the int3. Duh. I have been looking back and forth in that file, and I was mixing ftrace_modify_code_direct() (which only does a local sync) with ftrace_modify_code() (which does run_sync()). The dangers of moving around by searching for function names. That file is a maze of several functions that are very similarly named and do slightly different things. But yes, I was looking at the "direct" sequence. > I think you are missing the run_sync() which is the heavy hammer to > make sure all CPUs are in sync. And this is done at each stage: > > add int3 > run_sync(); > update call cite outside of int3 > run_sync() > remove int3 > run_sync() > > HPA said that the last run_sync() isn't needed, but I kept it because I > wanted to make sure. Looks like your analysis shows that it is needed. Absolutely. I think we could get rid of it, but yes, to then avoid the race we'd need to be a lot more clever. Yeah, with the three run_sunc() things, the races I thought it had can't happen. Linus