On 2025-02-05, "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
>> This is caused by RCU falling behind a callback-flooding kthread that
>> invokes call_rcu() in a semi-tight loop. Setting rcutree.kthread_prio=40
>> avoids the splat, but still gets the shutdown-time hang. Retrying with
>> the default rcutree.kthread_prio=2 failed to reproduce the splat, but
>> it did reproduce the shutdown-time hang.
>>
>> OK, maybe printk buffers are not being flushed? A 100-millisecond sleep
>> at the end of rcu_torture_cleanup() got all of rcutorture's output
>> flushed, but lost the subsequent shutdown-time console traffic. The
>> pr_flush(HZ/10,1) seems more sensible, but this is private to printk().
>>
>> I would like to log the shutdown-time console traffic because RCU can
>> sometimes break things on that path.

pr_flush() was made private because it had no users outside of printk.
It would not be a problem to make it available. Adding a pr_flush() to
rcu_torture_cleanup() would be an appropriate workaround for now (more
on this at the end).

> There is a call to kmsg_dump(KMSG_DUMP_SHUTDOWN) in kernel_power_off()
> that appears to be intended to dump out the printk() buffers,

It only dumps the buffers to the registered kmsg_dumpers. It is not
responsible for flushing console backlogs.

> but it
> does not seem to do so in kernels built with CONFIG_PREEMPT_RT=y.
> Does there need to be a pr_flush() call prior to the call to
> migrate_to_reboot_cpu()? Or maybe even to do_kernel_power_off_prepare()
> or kernel_shutdown_prepare()?

With CONFIG_PREEMPT_RT=y, legacy consoles only print via a dedicated
kthread. Without a pr_flush() somewhere, there is basically no chance
that their backlogs will get flushed, because no one is waiting for
them.

The new console API (NBCON) provides support for "atomic consoles",
which _do_ flush by transitioning to synchronous printing during
shutdown/reboot. Unfortunately, we still don't have any NBCON atomic
console implemented in the mainline kernel. The 8250 UART will be our
first driver, most likely available in 6.15. (With the current
PREEMPT_RT patch applied, the 8250 NBCON atomic driver is used.)

Since only CONFIG_PREEMPT_RT=y has this issue, I am not sure if we want
to sprinkle pr_flush() calls on all sleepable shutdown/reboot paths,
although that is certainly one way to handle it.

For your case, making pr_flush() non-private and adding a call to it in
rcu_torture_cleanup() would be an easy way to avoid your problem.

John Ogness
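
To illustrate why the backlog is lost when nothing waits for the printer
thread, here is a toy userspace model. It is not kernel code: every name
in it (toy_log, toy_printk(), toy_printer_step(), toy_flush(),
toy_backlog()) is hypothetical, and it only assumes that records are
stored immediately while a separate printer emits them later, one per
wakeup. The real pr_flush() waits on the console sequence numbers with a
millisecond timeout; this sketch counts printer wakeups instead.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_log {
	unsigned long next_seq;    /* records stored so far */
	unsigned long printed_seq; /* records the console has emitted */
};

/* Store one record. Nothing is printed in the caller's context. */
static void toy_printk(struct toy_log *log)
{
	log->next_seq++;
}

/* One wakeup of the printer thread: emit at most one pending record. */
static bool toy_printer_step(struct toy_log *log)
{
	if (log->printed_seq == log->next_seq)
		return false;
	log->printed_seq++;
	return true;
}

/*
 * Wait for the backlog that existed at entry to drain, giving the
 * printer at most max_steps wakeups. Returns true if it drained.
 */
static bool toy_flush(struct toy_log *log, unsigned int max_steps)
{
	unsigned long target = log->next_seq;

	while (log->printed_seq < target && max_steps > 0) {
		toy_printer_step(log);
		max_steps--;
	}
	return log->printed_seq >= target;
}

/* Records that would be lost if the machine powered off right now. */
static unsigned long toy_backlog(const struct toy_log *log)
{
	return log->next_seq - log->printed_seq;
}
```

In this model, calling toy_printk() several times and then "powering
off" immediately leaves toy_backlog() non-zero. A toy_flush() with a
large enough budget before the power-off drains it, and one with too
small a budget returns false and leaves the remainder unprinted.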