Re: [BUG] workqueues and printk not playing nice since next-20240130

Petr Mladek <pmladek@xxxxxxxx> · Fri, 2 Feb 2024 15:35:26 +0100

On Fri 2024-02-02 05:04:45, Paul E. McKenney wrote:
> Hello!
> 
> Starting with next-20240130 (and perhaps a bit earlier), rcutorture gets
> what initially looked like early-boot hangs, but only when running on
> dual-socket x86 systems [1], as it it works just fine on my x86 laptop [2].
> But when running on dual-socket systems, this happens all the time,
> perhaps because rcutorture works hard to split each guest OS across a
> socket boundary.
> 
> This is the reproducer:
> 
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 1m --configs "10*TREE01" --trust-make
> 
> By "looked like early-boot hangs" I mean that qemu was quite happy,
> but there was absolutely no console output.
> 
> Bisection identified this commit:
> 
> 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for unbound workqueues")
> 
> Reverting this commit made the problem go away.  Except that it is really
> hard to imagine this commit having any effect whatsoever on early boot
> execution.  Of course, this might be a failure of imagination on my part,
> so I enlisted the aid of gdb:
> 
> tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 1m --configs "TREE01" --trust-make --gdb
> 
> After following the resulting gdb startup instructions and waiting for
> about ten seconds, I hit control-C on the gdb window and then:
> 
> 	(gdb) bt
> 	#0  default_idle () at arch/x86/kernel/process.c:743
> 	#1  0xffffffff81e94d34 in default_idle_call () at kernel/sched/idle.c:97
> 	#2  0xffffffff810d746d in cpuidle_idle_call () at kernel/sched/idle.c:170
> 	#3  do_idle () at kernel/sched/idle.c:312
> 	#4  0xffffffff810d76a4 in cpu_startup_entry (state=state@entry=CPUHP_ONLINE)
> 	    at kernel/sched/idle.c:410
> 	#5  0xffffffff81e95417 in rest_init () at init/main.c:730
> 	#6  0xffffffff8329adf2 in start_kernel () at init/main.c:1067
> 	#7  0xffffffff832a5038 in x86_64_start_reservations (
> 	    real_mode_data=real_mode_data@entry=0x13d50 <exception_stacks+32080> <error: Cannot access memory at address 0x13d50>) at arch/x86/kernel/head64.c:555
> 	#8  0xffffffff832a513c in x86_64_start_kernel (
> 	    real_mode_data=0x13d50 <exception_stacks+32080> <error: Cannot access memory at address 0x13d50>) at arch/x86/kernel/head64.c:536
> 	#9  0xffffffff810001d2 in secondary_startup_64 ()
> 	    at arch/x86/kernel/head_64.S:461
> 	#10 0x0000000000000000 in ?? ()
> 	(gdb) print jiffies
> 	$1 = 4294676330
> 	(gdb) print system_state
> 	$2 = SYSTEM_RUNNING
> 
> In other words, the system really has booted, and at least one CPU is
> happily idling in the idle loop.  And another CPU is (maybe not quite
> so happily) running rcutorture:
> 
> 	(gdb) thread 6
> 	[Switching to thread 6 (Thread 1.6)]
> 	#0  0xffffffff8111160b in rcu_torture_one_read (
> 	    trsp=trsp@entry=0xffffc900004abe90, myid=myid@entry=4)
> 	    at kernel/rcu/rcutorture.c:2003
> 	2003            completed = cur_ops->get_gp_seq();
> 	(gdb) bt
> 	#0  0xffffffff8111160b in rcu_torture_one_read (
> 	    trsp=trsp@entry=0xffffc900004abe90, myid=myid@entry=4)
> 	    at kernel/rcu/rcutorture.c:2003
> 	#1  0xffffffff81111bef in rcu_torture_reader (arg=0x4 <fixed_percpu_data+4>)
> 	    at kernel/rcu/rcutorture.c:2097
> 	#2  0xffffffff810af3e0 in kthread (_create=0xffff8880047aa480)
> 	    at kernel/kthread.c:388
> 	#3  0xffffffff8103af1f in ret_from_fork (prev=<optimized out>,
> 	    regs=0xffffc900004abf58, fn=0xffffffff810af300 <kthread>,
> 	    fn_arg=0xffff8880047aa480) at arch/x86/kernel/process.c:147
> 	#4  0xffffffff8100247a in ret_from_fork_asm () at arch/x86/entry/entry_64.S:242
> 	#5  0x0000000000000000 in ?? ()
> 
> So the system really did boot and is running just fine.  It is just that
> there is no console output.  Details, details!
> 
> Is there anything I can do to some combination of workqueues and printk
> to help debug this?  Or that I can do to anything else, as I am not
> feeling all that picky.  ;-)

It really sounds strange. Console drivers should not use workqueues
at least for console->write() callbacks. The main reason is that
workqueues do not work in panic(). But it might bring many problems
even when the system is running "normally"

You wrote above that there was absolutely no console output so the problem
was from the very beginning.

I looks like that the consoles get blocked either during registration
of the first console or when trying to flush the first message.

Consoles, except for early consoles, are registered by
console_initcall(). And init calls are proceed when the workqueues
subsystem is already initialized. So, workqueues might
somehow affect console driver initialization and registration.

My first though was that the workqueues change might somehow
block synchronize_srcu() which is used when updating console_list.
But it not called during console registration. It is called only
during suspend, resume, and console unregistration. So, it should
not block the first boot messages.

My current theory is that the change in workqueues might somehow
block a console driver initialization. I think that some console
drivers might use workqueues for some tasks even though they
could not use them in the write() callback.

Could you please provide console log when the problematic patch
is disabled? I wonder what consoles drivers are registered and when.

Just to be sure. Does the rcu torture configuration modify the
behavior of srcu_read_lock_nmisafe() which is used by
console_srcu_read_lock()?

Best Regards,
Petr