On 2/13/25 17:11, Ido Schimmel wrote: > On Wed, Feb 12, 2025 at 01:41:51PM +0000, Gavrilov Ilia wrote: >> Syzkaller reports the following bug: >> >> BUG: spinlock bad magic on CPU#1, syz-executor.0/7995 >> lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 >> CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G E 5.10.209+ #1 >> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >> Call Trace: >> __dump_stack lib/dump_stack.c:77 [inline] >> dump_stack+0x119/0x179 lib/dump_stack.c:118 >> debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline] >> do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112 >> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline] >> _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159 >> reset_per_cpu_data+0xe6/0x240 [drop_monitor] >> net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor] >> genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739 >> genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] >> genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800 >> netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497 >> genl_rcv+0x29/0x40 net/netlink/genetlink.c:811 >> netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline] >> netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348 >> netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916 >> sock_sendmsg_nosec net/socket.c:651 [inline] >> __sock_sendmsg+0x157/0x190 net/socket.c:663 >> ____sys_sendmsg+0x712/0x870 net/socket.c:2378 >> ___sys_sendmsg+0xf8/0x170 net/socket.c:2432 >> __sys_sendmsg+0xea/0x1b0 net/socket.c:2461 >> do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46 >> entry_SYSCALL_64_after_hwframe+0x62/0xc7 >> RIP: 0033:0x7f3f9815aee9 >> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 >> RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e >> RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9 >> RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007 >> RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000 >> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >> R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768 >> >> If drop_monitor is built as a kernel module, syzkaller may have time >> to send a netlink NET_DM_CMD_START message during the module loading. >> This will call the net_dm_monitor_start() function that uses >> a spinlock that has not yet been initialized. >> >> To fix this, let's place resource initialization above the registration >> of a generic netlink family. >> >> Found by InfoTeCS on behalf of Linux Verification Center >> (linuxtesting.org) with SVACE. >> >> Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") >> Cc: stable@xxxxxxxxxxxxxxx >> Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@xxxxxxxxxxx> >> --- >> net/core/drop_monitor.c | 13 ++++++------- >> 1 file changed, 6 insertions(+), 7 deletions(-) >> >> diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c >> index 6efd4cccc9dd..9755d2010e70 100644 >> --- a/net/core/drop_monitor.c >> +++ b/net/core/drop_monitor.c >> @@ -1734,6 +1734,11 @@ static int __init init_net_drop_monitor(void) >> return -ENOSPC; >> } >> >> + for_each_possible_cpu(cpu) { >> + net_dm_cpu_data_init(cpu); >> + net_dm_hw_cpu_data_init(cpu); >> + } >> + >> rc = genl_register_family(&net_drop_monitor_family); > > This might be fine as-is, but it would be cleaner to move the > registration of the netlink family to the end of the module > initialization function, so that it's exposed to user space after all > the preparations have been done, including the registration of the net > device notifier. > Thank you for your review. I'll send it in V2. >> if (rc) { >> pr_err("Could not create drop monitor netlink family\n"); >> @@ -1749,11 +1754,6 @@ static int __init init_net_drop_monitor(void) >> >> rc = 0; >> >> - for_each_possible_cpu(cpu) { >> - net_dm_cpu_data_init(cpu); >> - net_dm_hw_cpu_data_init(cpu); >> - } >> - >> goto out; >> >> out_unreg: >> @@ -1772,13 +1772,12 @@ static void exit_net_drop_monitor(void) >> * Because of the module_get/put we do in the trace state change path >> * we are guaranteed not to have any current users when we get here >> */ >> + BUG_ON(genl_unregister_family(&net_drop_monitor_family)); > > Similarly, unregister the netlink family at the beginning of the module > exit function, before unregistering the net device notifier. > >> >> for_each_possible_cpu(cpu) { >> net_dm_hw_cpu_data_fini(cpu); >> net_dm_cpu_data_fini(cpu); >> } >> - >> - BUG_ON(genl_unregister_family(&net_drop_monitor_family)); >> } >> >> module_init(init_net_drop_monitor); >> -- >> 2.39.5 >>