On Wed 2018-06-06 14:10:29, Sergey Senozhatsky wrote: > On (06/05/18 14:47), Petr Mladek wrote: > [..] > > Grr, the ABBA deadlock is still there. NMIs are not sent to the other > > CPUs atomically. Even if we detect that logbuf_lock is available > > in printk_nmi_enter() on some CPUs, it might still get locked on > > another CPU before the other CPU gets NMI. > > Can we do something about "B"? :) I mean - any chance we can rework > locking in nmi_cpu_backtrace()? I can't think of any possibility at the moment. Well, it does not mean that it does not exist. The irony is that we need the extra lock in nmi_cpu_backtrace() only because we try to store the messages into the common log buffer. If we always use the per-CPU buffers in NMI then the lock would not cause any problems but it also won't be necessary. > > By other words, any check in printk_safe_enter() is racy and not > > sufficient > > I suppose you meant printk_nmi_enter(). Yup, I am sorry for confusion. > > => I suggest to revert the commit 719f6a7040f1bdaf96fcc70 > > "printk: Use the main logbuf in NMI when logbuf_lock is available" > > for-4.18 and stable until we get a better solution. > > Just random thoughts. > > May be we need to revert it, but let's not "panic". I think [but don't > insist on it] that the patch in question is *probably* "good enough". It > addresses a bug report after all. It was a problem reported by me. I found it when testing other changes. The patch improved the situation definitely. The question is if it is enough in practice. > How often do we have arch_trigger_cpumask_backtrace() on all CPUs these > days? I tend to think that it used to be much more popular in the past, > because we had a loops_per_jiffy based spin_lock lockup detection which > would trigger NMI backtracase, but this functionality has gone, see > bc88c10d7e6900916f5e1ba3829d66a9de92b633 for details. AFAIK, we (SUSE) started to work on the printk/NMI deadlocks because it was too risky to use sysrq-l on systems with many CPUs. It means that the use case is still there. Hmm, the original problem was that any interrupted logbuf_lock owner caused the deadlock. It is _different now_ because the interrupted logbuf_lock owner must block another CPU via the lock in nmi_cpu_backtrace(). I actually wonder why this patch actually helped so much in my test (while true; do echo l >/proc/sysrq-trigger; done). My guess is that it increased a lot the number of CPUs using PRINTK_NMI_CONTEXT (per-CPU buffers). Therefore the deadlock was less likely. On the other hand, the original problem was basically back without this patch. > I'm not saying that the race condition that you found is > unrealistic, I'm just saying that _it seems_ that > nmi_panic()->printk() on a single CPU is more common > now, so having that nmi_printk()->printk_deferred() might be quite > valuable at the end of the day. I see the point. I need to think about it. > May be I'm wrong! > > > The only safe solution seems to be a trylock() in NMI in > > vprintk_emit() and fallback to vprintk_safe() when the lock > > is not taken. I mean something like: > > > > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > > index 247808333ba4..4a5a0bf221b3 100644 > > --- a/kernel/printk/printk.c > > +++ b/kernel/printk/printk.c > > @@ -1845,7 +1845,13 @@ asmlinkage int vprintk_emit(int facility, int level, > > printk_delay(); > > > > /* This stops the holder of console_sem just where we want him */ > > - logbuf_lock_irqsave(flags); > > + printk_safe_enter_irqsave(flags); > > + if (in_nmi() && !raw_spin_trylock(&logbuf_lock)) { > > + vprintk_nmi(fmt, args); > > + printk_safe_exit_irqrestore(flags); > > + return; > > + } else > > + raw_spin_lock(&logbuf_lock); > > /* > > * The printf needs to come first; we need the syslog > > * prefix which might be passed-in as a parameter. > > I need some time to think about it. The bad thing about this solution is that different messages might be handled different way during a single nmi_cpu_backtrace(). The result might be mixed backtraces in the log. > > Sigh, this looks like a material for-4.19. > > Agreed. > > > We might need to revisit if printk_context still makes sense, ... > What do you mean by this? My dream was to select the right vprintk implementation according to printk_context on a single place (vprintk_func()). If this is not usable for NMI, we might need to think about another solution. But this is premature. I believe the printk_context will most likely make still sense. I need to think more about it. Best Regards, Petr