On Tue 2020-11-17 09:33:25, Steven Rostedt wrote:
> On Tue, 17 Nov 2020 12:23:41 +0200
> Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> > Hi,
> >
> > Approximately two weeks ago, our regression team started to experience those
> > netconsole splats. The tested code is Linus's master (-rc4) + netdev net-next
> > + netdev net-rc.
> >
> > Such splats are random and we can't bisect because there is no stable reproducer.
> >
> > Any idea, what is the root cause?
> >
> > [ 21.149739] __do_sys_finit_module+0xbc/0x12c
> > [ 21.149740] __arm64_sys_finit_module+0x28/0x34
> > [ 21.149741] el0_svc_common.constprop.0+0x84/0x200
> > [ 21.149742] do_el0_svc+0x2c/0x90
> > [ 21.149743] el0_svc+0x18/0x50
> > [ 21.149744] el0_sync_handler+0xe0/0x350
> > [ 21.149745] el0_sync+0x158/0x180
> > [ 21.149746] }
> > [ 21.149747] ... key at: [<ffff8000093d4018>] target_list_lock+0x18/0xfffffffffffff000 [netconsole]
> > [ 21.149748] ..
> > [ 21.149750] Lost 190 message(s)!
>
> It really sucks that we lose 190 messages that would help to decipher this
> more. :-p

The "Lost 190 message(s)!" line comes from the printk_safe code. The size
of its buffer can be increased via CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT.

> Because I'm not sure where the xmit_lock is taken while holding the
> target_list_lock. But the above does show that printk() calls write_msg()
> while holding the console_lock, and write_msg() takes the target_list_lock.
>
> Thus, the fix would either require disabling interrupts every time the
> xmit_lock is taken, or to get it from being taken while holding the
> target_list_lock.

It seems that the missing messages might help to find the root
of the problem.

Best Regards,
Petr
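
P.S. A minimal sketch of the config change, in case it helps to capture
the lost messages on the next run. The value is a power-of-two shift for
the per-CPU printk_safe buffer. If I remember the Kconfig entry correctly,
the default is 13 (8KB) and the allowed range is 10-21. The 16 below is
only an illustrative choice, not a tested recommendation:

    # .config fragment; 16 => 64KB per CPU (illustrative value)
    CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=16

The larger buffer is allocated for every CPU, so it is probably best
limited to debug builds used by the regression runs.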