On (20/03/24 11:13), Zygo Blaxell wrote:
> On Wed, Nov 13, 2019 at 04:16:25PM -0500, Qian Cai wrote:
> > From: Sergey Senozhatsky <sergey.senozhatsky.work@xxxxxxxxx>
> >
> > Sergey didn't like the locking order,
> >
> > uart_port->lock -> tty_port->lock
> >
> > uart_write (uart_port->lock)
> >   __uart_start
> >     pl011_start_tx
> >       pl011_tx_chars
> >         uart_write_wakeup
> >           tty_port_tty_wakeup
> >             tty_port_default
> >               tty_port_tty_get (tty_port->lock)
> >
> > but those code is so old, and I have no clue how to de-couple it after
> > checking other locks in the splat. There is an onging effort to make all
> > printk() as deferred, so until that happens, workaround it for now as a
> > short-term fix.
>
> Starting with v5.4.22 I noticed 'dmesg -w' stopped working on some
> machines. dmesg will follow console output for a few seconds, then it
> stops. strace indicates dmesg is blocked in read() on the /dev/kmsg fd.
> If a new dmesg process starts, it gives messages for a few seconds,
> then also stops. rsyslog's kernel logging is similarly affected.
>
> Bisection points to this patch (now known as
> 1b710b1b10eff9d46666064ea25f079f70bc67a8 upstream). I can't reproduce
> the problem on a test VM, and some machines are running v5.4.22..v5.4.26
> with no dmesg problems. It seems there is some magic in the startup
> sequence of affected machines. This code isn't executed after RNG is
> seeded, so it would have to get its bad stuff done before that happens.
>
> Reverting commit 1b710b1b10eff9d46666064ea25f079f70bc67a8 fixes the
> dmesg regression on 5.4.26. It might put the original lockdep bug back,
> but on machines running stable kernels, I prefer randomly broken lockdep
> over repeatably broken dmesg.

This should fix the problem
https://lore.kernel.org/lkml/20200303113002.63089-1-sergey.senozhatsky@xxxxxxxxx

	-ss