Hi,

This is not applicable; a revision with the new flag machine_stop_pending
instead of machine_stopped will be sent out tomorrow.

Regards,
Wu Zhangjin

On 8/24/10, Wu Zhangjin <wuzhangjin@xxxxxxxxx> wrote:
> From: Wu Zhangjin <zhangjin.wu@xxxxxxxxxxxxx>
>
> In Ftrace, we need to flush the icache after code modification to ensure
> that the instructions which get executed are exactly the ones we want.
>
> The reason is the one given in arch/x86/kernel/ftrace.c:
>
>  * Modifying code must take extra care. On an SMP machine, if
>  * the code being modified is also being executed on another CPU
>  * that CPU will have undefined results and possibly take a GPF.
>  * We use kstop_machine to stop other CPUS from exectuing code.
>
> On SMP, code modification is therefore protected by stop_machine(), which
> disables interrupts on all cpus; the code is then modified and the icache
> flushed while the machine is stopped.
>
> On MIPS SMP, to tell the other cpus to flush their icaches, an IPI has to
> be sent to them and the sender has to wait until they have finished the
> flush. But because stop_machine() has already disabled interrupts on all
> cpus, the IPI is never serviced and the sender waits forever:
> deadlock -> hang.
>
> (Note: Cavium is an exception. Thanks to its synci instruction it does
> not go through smp_call_function() for the icache flush but sends the
> ICACHE_FLUSH IPI to the other cpus directly, without waiting, so there is
> no hang on Cavium. Once interrupts are enabled again, the pending icache
> flush interrupt is handled and synci flushes the icache on each cpu, so
> there is no cache problem either.)
>
> To break this deadlock, the key is to stop calling flush_icache_range()
> inside stop_machine() and to delay it until after stop_machine(). Delaying
> the icache flush does not influence the tracing results, even if the other
> cpus execute the just-modified code before the flush, because kernel
> tracing is only enabled after the user issues:
>
> $ echo 1 > /path/to/tracing/tracing_enabled
>
> Thanks to the weak functions ftrace_arch_code_modify_prepare() and
> ftrace_arch_code_modify_post_process(), which are called by
> ftrace_run_update_code() before and after stop_machine() respectively,
> ftrace_modify_code() can check whether it is being called inside
> stop_machine() and, if so, delay the icache flush. As a result, the
> deadlock is broken.
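[ For reference, the generic code that calls these hooks, in
  kernel/trace/ftrace.c, looks roughly like the sketch below; this is a
  simplified paraphrase of the kernels of that era, so details may differ
  between versions: ]

static void ftrace_run_update_code(int command)
{
        int ret;

        /* arch hook, runs before stop_machine(); MIPS sets machine_stopped */
        ret = ftrace_arch_code_modify_prepare();
        FTRACE_WARN_ON(ret);
        if (ret)
                return;

        /* every other cpu sits with interrupts disabled while this runs */
        stop_machine(__ftrace_modify_code, &command, NULL);

        /* arch hook, runs after stop_machine(); MIPS flushes the caches here */
        ret = ftrace_arch_code_modify_post_process();
        FTRACE_WARN_ON(ret);
}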
> Without this patch, Ftrace for RMI XLS will hang after issuing the
> following command:
>
> $ echo function > /path/to/tracing/current_tracer
>
> More precisely, it hangs in kernel/smp.c:
>
> void smp_call_function_many(const struct cpumask *mask,
> {
>         [snip]
>         /*
>          * Can deadlock when called with interrupts disabled.
>          * We allow cpu's that are not yet online though, as no one else can
>          * send smp call function interrupt to this cpu and as such deadlocks
>          * can't happen.
>          */
>         WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
>                      && !oops_in_progress);
>         [snip]
>         csd_lock(&data->csd);
>
>         [snip]
>         /* Send a message to all CPUs in the map */
>         arch_send_call_function_ipi_mask(data->cpumask);
>
>         /* Optionally wait for the CPUs to complete */
>         if (wait)
>                 csd_lock_wait(&data->csd);    --> hangs here
> }
>
> Signed-off-by: Wu Zhangjin <wuzhangjin@xxxxxxxxx>
> ---
>  arch/mips/kernel/ftrace.c |   22 +++++++++++++++++++++-
>  1 files changed, 21 insertions(+), 1 deletions(-)
>
> diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
> index 5a84a1f..c8ebb13 100644
> --- a/arch/mips/kernel/ftrace.c
> +++ b/arch/mips/kernel/ftrace.c
> @@ -69,6 +69,23 @@ static inline void ftrace_dyn_arch_init_insns(void)
>  #endif
>  }
>
> +#ifdef CONFIG_SMP
> +static int machine_stopped __read_mostly;
> +
> +int ftrace_arch_code_modify_prepare(void)
> +{
> +        machine_stopped = 1;
> +        return 0;
> +}
> +
> +int ftrace_arch_code_modify_post_process(void)
> +{
> +        __flush_cache_all();
> +        machine_stopped = 0;
> +        return 0;
> +}
> +#endif
> +
>  static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
>  {
>          int faulted;
> @@ -79,7 +96,10 @@ static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
>          if (unlikely(faulted))
>                  return -EFAULT;
>
> -        flush_icache_range(ip, ip + 8);
> +#ifdef CONFIG_SMP
> +        if (!machine_stopped)
> +#endif
> +                flush_icache_range(ip, ip + 8);
>
>          return 0;
>  }
> --
> 1.7.0.4
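[ For completeness: the reason flush_icache_range() ends up waiting in
  smp_call_function_many() on MIPS SMP is the r4k cache code, which runs
  the flush on every cpu and waits for completion. A rough sketch,
  paraphrased from arch/mips/mm/c-r4k.c of that era -- simplified, and the
  exact details vary with kernel version and configuration: ]

static inline void r4k_on_each_cpu(void (*func)(void *info), void *info)
{
        preempt_disable();
        /*
         * wait == 1: block until every other cpu has run func().  With
         * interrupts disabled everywhere by stop_machine(), the IPI is
         * never serviced, so this wait never finishes -- the hang above.
         */
        smp_call_function(func, info, 1);
        func(info);
        preempt_enable();
}

static void r4k_flush_icache_range(unsigned long start, unsigned long end)
{
        struct flush_icache_range_args args = { .start = start, .end = end };

        r4k_on_each_cpu(local_r4k_flush_icache_range_ipi, &args);
}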