Re: kernel splat during boot

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 18, 2022 at 05:10:47PM -0700, Alexei Starovoitov wrote:
> Hi Paul,
> 
> I see the following splat just booting net-next or bpf-next trees.
> I have lockdep and kasan on (if that matters).
> Is this a known issue?
> 
> [    3.011826] cblist_init_generic: Setting adjustable number of
> callback queues.
> [    3.011880]
> [    3.011883] =============================
> [    3.011885] [ BUG: Invalid wait context ]
> [    3.011889] 5.19.0-14019-g75179e2b7f9a #4201 Not tainted
> [    3.011893] -----------------------------
> [    3.011896] swapper/0/1 is trying to lock:
> [    3.011899] ffffffff85be94b8 (&port_lock_key){....}-{3:3}, at:
> serial8250_console_write+0x5fc/0x640
> [    3.011929] other info that might help us debug this:
> [    3.011931] context-{5:5}
> [    3.011934] 3 locks held by swapper/0/1:
> [    3.011938]  #0: ffffffff8404dae0
> (rcu_tasks_rude.cbs_gbl_lock){....}-{2:2}, at:
> cblist_init_generic+0x27/0x340
> [    3.011964]  #1: ffffffff84041d80 (console_lock){+.+.}-{0:0}, at:
> vprintk_emit+0xda/0x2e0
> [    3.011984]  #2: ffffffff839416e0 (console_owner){....}-{0:0}, at:
> console_emit_next_record.constprop.37+0x1c9/0x4d0
> [    3.012004] stack backtrace:
> [    3.012007] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 5.19.0-14019-g75179e2b7f9a #4201
> [    3.012011] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> [    3.012011] Call Trace:
> [    3.012011]  <TASK>
> [    3.012011]  dump_stack_lvl+0x44/0x57
> [    3.012011]  __lock_acquire.cold.73+0xc7/0x31b
> [    3.012011]  ? lockdep_hardirqs_on_prepare+0x1f0/0x1f0
> [    3.012011]  ? rcu_read_lock_sched_held+0x91/0xc0
> [    3.012011]  ? rcu_read_lock_bh_held+0xa0/0xa0
> [    3.012011]  lock_acquire+0x133/0x380
> [    3.012011]  ? serial8250_console_write+0x5fc/0x640
> [    3.012011]  ? lock_release+0x3b0/0x3b0
> [    3.012011]  _raw_spin_lock_irqsave+0x35/0x50
> [    3.012011]  ? serial8250_console_write+0x5fc/0x640
> [    3.012011]  serial8250_console_write+0x5fc/0x640
> [    3.012011]  ? rcu_read_lock_bh_held+0xa0/0xa0
> [    3.012011]  ? serial8250_default_handle_irq+0x80/0x80
> [    3.012011]  ? lock_release+0x3b0/0x3b0
> [    3.012011]  ? do_raw_spin_lock+0x107/0x1c0
> [    3.012011]  ? rwlock_bug.part.2+0x60/0x60
> [    3.012011]  ? prb_final_commit+0x50/0x50
> [    3.012011]  console_emit_next_record.constprop.37+0x271/0x4d0
> [    3.012011]  ? info_print_ext_header.constprop.38+0x110/0x110
> [    3.012011]  ? rcu_read_lock_sched_held+0x91/0xc0
> [    3.012011]  console_unlock+0x1d8/0x2c0
> [    3.012011]  ? devkmsg_open+0x170/0x170
> [    3.012011]  ? vprintk_emit+0xda/0x2e0
> [    3.012011]  vprintk_emit+0xe3/0x2e0
> [    3.012011]  _printk+0x96/0xb0
> [    3.012011]  ? pm_suspend.cold.6+0x2e3/0x2e3
> [    3.012011]  ? _raw_write_unlock+0x1f/0x30
> [    3.012011]  ? do_raw_spin_lock+0x107/0x1c0
> [    3.012011]  cblist_init_generic.cold.36+0x24/0x32
> [    3.012011]  rcu_init_tasks_generic+0x23/0xf0
> [    3.012011]  kernel_init_freeable+0x1d3/0x38a
> [    3.012011]  ? _raw_spin_unlock_irq+0x24/0x30
> [    3.012011]  ? rest_init+0x1d0/0x1d0
> [    3.012011]  kernel_init+0x18/0x130
> [    3.012011]  ? rest_init+0x1d0/0x1d0
> [    3.012011]  ret_from_fork+0x1f/0x30
> [    3.012011]  </TASK>
> [    3.012029] cblist_init_generic: Setting shift to 3 and lim to 1.
> [    3.013301] cblist_init_generic: Setting shift to 3 and lim to 1.
> [    3.014203] Running RCU-tasks wait API self tests

Does the the patch from Shigeru Yoshida shown below help?

This patch is in my view a workaround for printf() losing the ability
to print from interrupts-diabled sections of code, but maybe it will be
of help to you.

							Thanx, Paul

------------------------------------------------------------------------

>From SRS0=JFiA=YG=redhat.com=syoshida@xxxxxxxxxx Tue Aug  2 09:31:25 2022
From: Shigeru Yoshida <syoshida@xxxxxxxxxx>
To: paulmck@xxxxxxxxxx,
	frederic@xxxxxxxxxx,
	neeraju@xxxxxxxxxxx,
	josh@xxxxxxxxxxxxxxxx
Cc: rostedt@xxxxxxxxxxx,
	mathieu.desnoyers@xxxxxxxxxxxx,
	jiangshanlai@xxxxxxxxx,
	joel@xxxxxxxxxxxxxxxxx,
	rcu@xxxxxxxxxxxxxxx,
	linux-kernel@xxxxxxxxxxxxxxx,
	Shigeru Yoshida <syoshida@xxxxxxxxxx>
Subject: [PATCH] rcu-tasks: Avoid pr_info() with spin lock in cblist_init_generic()
Date: Wed,  3 Aug 2022 01:22:05 +0900
Message-Id: <20220802162205.817796-1-syoshida@xxxxxxxxxx>

pr_info() is called with rtp->cbs_gbl_lock spin lock locked.  Because
pr_info() calls printk() that might sleep, this will result in BUG
like below:

[    0.206455] cblist_init_generic: Setting adjustable number of callback queues.
[    0.206463]
[    0.206464] =============================
[    0.206464] [ BUG: Invalid wait context ]
[    0.206465] 5.19.0-00428-g9de1f9c8ca51 #5 Not tainted
[    0.206466] -----------------------------
[    0.206466] swapper/0/1 is trying to lock:
[    0.206467] ffffffffa0167a58 (&port_lock_key){....}-{3:3}, at: serial8250_console_write+0x327/0x4a0
[    0.206473] other info that might help us debug this:
[    0.206473] context-{5:5}
[    0.206474] 3 locks held by swapper/0/1:
[    0.206474]  #0: ffffffff9eb597e0 (rcu_tasks.cbs_gbl_lock){....}-{2:2}, at: cblist_init_generic.constprop.0+0x14/0x1f0
[    0.206478]  #1: ffffffff9eb579c0 (console_lock){+.+.}-{0:0}, at: _printk+0x63/0x7e
[    0.206482]  #2: ffffffff9ea77780 (console_owner){....}-{0:0}, at: console_emit_next_record.constprop.0+0x111/0x330
[    0.206485] stack backtrace:
[    0.206486] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-00428-g9de1f9c8ca51 #5
[    0.206488] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
[    0.206489] Call Trace:
[    0.206490]  <TASK>
[    0.206491]  dump_stack_lvl+0x6a/0x9f
[    0.206493]  __lock_acquire.cold+0x2d7/0x2fe
[    0.206496]  ? stack_trace_save+0x46/0x70
[    0.206497]  lock_acquire+0xd1/0x2f0
[    0.206499]  ? serial8250_console_write+0x327/0x4a0
[    0.206500]  ? __lock_acquire+0x5c7/0x2720
[    0.206502]  _raw_spin_lock_irqsave+0x3d/0x90
[    0.206504]  ? serial8250_console_write+0x327/0x4a0
[    0.206506]  serial8250_console_write+0x327/0x4a0
[    0.206508]  console_emit_next_record.constprop.0+0x180/0x330
[    0.206511]  console_unlock+0xf7/0x1f0
[    0.206512]  vprintk_emit+0xf7/0x330
[    0.206514]  _printk+0x63/0x7e
[    0.206516]  cblist_init_generic.constprop.0.cold+0x24/0x32
[    0.206518]  rcu_init_tasks_generic+0x5/0xd9
[    0.206522]  kernel_init_freeable+0x15b/0x2a2
[    0.206523]  ? rest_init+0x160/0x160
[    0.206526]  kernel_init+0x11/0x120
[    0.206527]  ret_from_fork+0x1f/0x30
[    0.206530]  </TASK>
[    0.207018] cblist_init_generic: Setting shift to 1 and lim to 1.

This patch moves pr_info() so that it is called without
rtp->cbs_gbl_lock locked.

Signed-off-by: Shigeru Yoshida <syoshida@xxxxxxxxxx>
---
 kernel/rcu/tasks.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 3925e32159b5..d46dd970bf22 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -227,7 +227,6 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 	if (rcu_task_enqueue_lim < 0) {
 		rcu_task_enqueue_lim = 1;
 		rcu_task_cb_adjust = true;
-		pr_info("%s: Setting adjustable number of callback queues.\n", __func__);
 	} else if (rcu_task_enqueue_lim == 0) {
 		rcu_task_enqueue_lim = 1;
 	}
@@ -256,6 +255,10 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 		raw_spin_unlock_rcu_node(rtpcp); // irqs remain disabled.
 	}
 	raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
+
+	if (rcu_task_cb_adjust)
+		pr_info("%s: Setting adjustable number of callback queues.\n", __func__);
+
 	pr_info("%s: Setting shift to %d and lim to %d.\n", __func__, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim));
 }
 
-- 
2.37.1





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux