Re: [PATCH v5 1/4] printk/nmi: generic solution for safe printk in NMI

Steven Rostedt <rostedt@xxxxxxxxxxx> · Thu, 27 Apr 2017 11:42:48 -0400

On Thu, 27 Apr 2017 17:28:07 +0200
Petr Mladek <pmladek@xxxxxxxx> wrote:

> > When I get a chance, I'll see if I can insert a trigger to crash the
> > kernel from NMI on another box and see if this patch helps.  
> 
> I actually tested it here using this hack:
> 
> diff --cc lib/nmi_backtrace.c
> index d531f85c0c9b,0bc0a3535a8a..000000000000
> --- a/lib/nmi_backtrace.c
> +++ b/lib/nmi_backtrace.c
> @@@ -89,8 -90,7 +90,9 @@@ bool nmi_cpu_backtrace(struct pt_regs *
>         int cpu = smp_processor_id();
>   
>         if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
>  +              if (in_nmi())
>  +                      panic("Simulating panic in NMI\n");
> +               arch_spin_lock(&lock);

I was going to create a ftrace trigger, to crash on demand, but this
may do as well.

>                 if (regs && cpu_in_idle(instruction_pointer(regs))) {
>                         pr_warn("NMI backtrace for cpu %d skipped: idling at pc %#lx\n",
>                                 cpu, instruction_pointer(regs));
> 
> and triggered by:
> 
>    echo  l > /proc/sysrq-trigger
> 
> The patch really helped to see much more (all) messages from the ftrace
> buffers in NMI mode.
> 
> But the test is a bit artifical. The patch might not help when there
> is a big printk() activity on the system when the panic() is
> triggered. We might wrongly use the small per-CPU buffer when
> the logbuf_lock is tested and taken on another CPU at the same time.
> It means that it will not always help.
> 
> I personally think that the patch might be good enough. I am not sure
> if a perfect (more comlpex) solution is worth it.

I wasn't asking for perfect, as the previous solutions never were
either. I just want an optimistic dump if possible.

I'll try to get some time today to test this, and let you know. But it
wont be on the machine that I originally had the issue with.

Thanks,

-- Steve