On (03/12/19 09:17), John Ogness wrote:
> > wait M times (N - 1). Sounds quadratic.
>
> If these are critical messages, then we are _not allowed to drop any_!
> For critical messages printk must be synchronous. Thus for critical
> messages the situation you illustrated is appropriate.
>
> > 40) goto 10
> >
> > So I have some doubts regarding some of assumptions behind new printk
> > design. And the problem is not in prb_lock() unfairness. Current
> > printk design does look to me SMP-friendly; yes, it has unbound
> > printing loop; that can be addressed.
>
> Let us not forget, it deadlocked the machine. That's the reason this
> thread exists.

It didn't deadlock the machine. It was a typical soft lockup. The
printing CPU looped in console_unlock() with preemption disabled; the
soft lockup hrtimer was running on that CPU, but because preemption was
disabled around console_unlock(), the soft lockup per-CPU kthread could
not get scheduled and could not update the per-CPU touch_ts. The soft
lockup hrtimer detected it:

[ 5128.552442] watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [kworker/9:53:4131]

Along with that, RCU was not able to get scheduled, which was detected
by the RCU stall detector:

[ 4891.199009] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 4891.221308] device-mapper: integrity: Checksum failed at sector 0x118d4f
[ 4891.251366] rcu: 9-....: (1923 ticks this GP) idle=7fa/1/0x4000000000000002 softirq=2190/2190 fqs=15013
[ 4891.251367] rcu: (detected by 16, t=60054 jiffies, g=24641, q=351)
[ 4891.311941] Sending NMI from CPU 16 to CPUs 9:

[..]

> 2. You seem unwilling to acknowledge the difference between emergency
>    and informational messages. A message is either critical or it is
>    not. If it is, it should be handled as such, regardless of
>    interference, regardless if it means turning an SMP machine into a UP
>    machine. If it is not critical, it should be sent along a
>    non-interfering path so the system is _not_ affected.

OK. Let's move on then.

	-ss