On Wed 2019-03-06 09:27:13, Mikulas Patocka wrote:
> Hi
>
> I was debugging some kernel lockup with storage drivers and it turned out
> that the lockup is caused by the serial console subsystem. If we use
> serial console and if we write to it excessively, the kernel sometimes
> lockup, sometimes reports rcu stalls and NMI backtraces. Sometimes it will
> just print the console messages without doing anything else.

This is a very old problem that we have been trying to solve for years.
There are two conflicting requirements on printk(): to be fast and to be
reliable.

The historical solution is that printk() callers store their messages in
the log buffer and then just _try_ to take the console lock. The caller
that succeeds is responsible for flushing all pending messages to the
console. As a result, a random victim might get blocked by console
handling for a long time.

An obvious solution is to offload the console handling. But that works
against reliability. There is no guarantee that the offload mechanism
(kthread, irq) would get to run when the system is on its knees.

Anyway, which kernel version are you using, please? I wonder if you
already have commit dbdda842fe96f8932 ("printk: Add console owner and
waiter logic to load balance console writes"). It improves the situation
a lot. The hope was that it would be enough in real life.

> This program tests the issue - on framebuffer console, the system is
> sluggish, but it is possible to unload the module with rmmod. On serial
> console, it locks up to the point that unloading the module is not
> possible.

Is there any chance you could send us logs from the original (real life)
problem, please?

Best regards,
Petr