Re: [PATCH] tracing/ftrace: Fix the potential hang on MIPS SMP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

This is not applicable, a revision with the new flag
machine_stop_pending instead of machine_stopped will be sent out
tomorrow.

Regards,
Wu Zhangjin

On 8/24/10, Wu Zhangjin <wuzhangjin@xxxxxxxxx> wrote:
> From: Wu Zhangjin <zhangjin.wu@xxxxxxxxxxxxx>
>
> In Ftrace, we need to flush the icache after code modification to ensure
> the instructions will be executed are exactly what we want.
>
> And for the following reason(arch/x86/kernel/ftrace.c):
>
>  * Modifying code must take extra care. On an SMP machine, if
>  * the code being modified is also being executed on another CPU
>  * that CPU will have undefined results and possibly take a GPF.
>  * We use kstop_machine to stop other CPUS from exectuing code.
>
> In SMP, the code modification are protected by stop_machine(), which
> will disables the irqs of all cpus and then modify the code, flush the
> icache.
>
> In MIPS SMP, to tell the other cpus to flush their related icache, the
> IPI interrupt must be sent to them and wait for them exiting from the
> icache flushing, but for the stop_machine() have disabled interrupts, it
> will need to wait for the other cpus all the time, then deadlock ->
> hang.
>
> (Note: cavium is an exception, benefit from its synci instruction, it
> doesn't call smp_call_function() to execute the icache flushing
> operation but send the ICACHE_FLUSH ipi to the other cpus directly, so,
> no wait and no hang on cavium, and after the irqs of the cpus are
> enabled, the pending icache flush interrupt will be filed and synci will
> flush the icache on every cpu respectively, so, no cache problem).
>
> To break this deadlock, the key is: stop calling flush_icache_range() in
> stop_machine() but delay it after stop_machine(). delaying the icache
> flushing operation doesn't influence the tracing results even if the
> other cpus execute the code just modified before the icache flushing,
> for the kernel tracing will only be enabled after users issuing:
>
> $ echo 1 > /path/to/tracing/tracing_enabled
>
> Thanks to the weak functions: ftrace_arch_code_modify_prepare() and
> ftrace_arch_code_modify_post_process(). they are called by
> ftrace_run_update_code() before and after stop_machine() respectively,
> with them, ftrace_modify_code() can check whether it is called in
> stop_machine() and if called in stop_machine(), then delay the operation
> of icache flushing, as a result, the deadlock is broken.
>
> Without this patch, Ftrace for RMI XLS will hang after issuing the
> following command:
>
> $ echo function > /path/to/tracing/current_tracer
>
> Exactly, it hangs on kernel/smp.c:
>
> void smp_call_function_many(const struct cpumask *mask,
> {
> 	[snip]
> 	/*
> 	 * Can deadlock when called with interrupts disabled.
> 	 * We allow cpu's that are not yet online though, as no one else can
> 	 * send smp call function interrupt to this cpu and as such deadlocks
> 	 * can't happen.
> 	 */
> 	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
> 		     && !oops_in_progress);
> 	[snip]
> 	csd_lock(&data->csd);
>
> 	[snip]
> 	/* Send a message to all CPUs in the map */
> 	arch_send_call_function_ipi_mask(data->cpumask);
>
> 	/* Optionally wait for the CPUs to complete */
> 	if (wait)
> 		csd_lock_wait(&data->csd);	--> hang here
> }
>
> Signed-off-by: Wu Zhangjin <wuzhangjin@xxxxxxxxx>
> ---
>  arch/mips/kernel/ftrace.c |   22 +++++++++++++++++++++-
>  1 files changed, 21 insertions(+), 1 deletions(-)
>
> diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
> index 5a84a1f..c8ebb13 100644
> --- a/arch/mips/kernel/ftrace.c
> +++ b/arch/mips/kernel/ftrace.c
> @@ -69,6 +69,23 @@ static inline void ftrace_dyn_arch_init_insns(void)
>  #endif
>  }
>
> +#ifdef CONFIG_SMP
> +static int machine_stopped __read_mostly;
> +
> +int ftrace_arch_code_modify_prepare(void)
> +{
> +	machine_stopped = 1;
> +	return 0;
> +}
> +
> +int ftrace_arch_code_modify_post_process(void)
> +{
> +	__flush_cache_all();
> +	machine_stopped = 0;
> +	return 0;
> +}
> +#endif
> +
>  static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
>  {
>  	int faulted;
> @@ -79,7 +96,10 @@ static int ftrace_modify_code(unsigned long ip, unsigned
> int new_code)
>  	if (unlikely(faulted))
>  		return -EFAULT;
>
> -	flush_icache_range(ip, ip + 8);
> +#ifdef CONFIG_SMP
> +	if (!machine_stopped)
> +#endif
> +		flush_icache_range(ip, ip + 8);
>
>  	return 0;
>  }
> --
> 1.7.0.4
>
>
>



[Index of Archives]     [Linux MIPS Home]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Linux]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux