Paulo Alcantara <pc@xxxxxxxxxxxxx> writes: >> @@ -515,7 +573,18 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v) >> seq_printf(m, "\n\n\tExtra Channels: %zu ", >> ses->chan_count-1); >> for (j = 1; j < ses->chan_count; j++) { >> + /* >> + * kernel_getsockname can block inside >> + * cifs_dump_channel. so drop the lock first >> + */ >> + server->srv_count++; >> + spin_unlock(&cifs_tcp_ses_lock); >> + >> cifs_dump_channel(m, j, &ses->chans[j]); >> + >> + cifs_put_tcp_session(server, 0); >> + spin_lock(&cifs_tcp_ses_lock); > > Here you are re-acquiring @cifs_tcp_ses_lock spinlock under > @ses->chan_lock, which will introduce deadlocks in threads calling > cifs_match_super(), cifs_signal_cifsd_for_reconnect(), > cifs_mark_tcp_ses_conns_for_reconnect(), cifs_find_smb_ses(), ... A simple reproducer $ mount.cifs //srv/share /mnt -o ...,multichannel $ cat /proc/fs/cifs/DebugData [ 1293.512572] BUG: sleeping function called from invalid context at net/core/sock.c:3507 [ 1293.513915] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1068, name: cat [ 1293.515381] preempt_count: 1, expected: 0 [ 1293.516321] RCU nest depth: 0, expected: 0 [ 1293.517294] 3 locks held by cat/1068: [ 1293.518165] #0: ffff88800818fc48 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x59/0x470 [ 1293.519383] #1: ffff88800aed2b28 (&ret_buf->chan_lock){+.+.}-{2:2}, at: cifs_debug_data_proc_show+0x555/0xee0 [cifs] [ 1293.520865] #2: ffff888011c9a540 (sk_lock-AF_INET-CIFS){+.+.}-{0:0}, at: inet_getname+0x29/0xa0 [ 1293.522098] CPU: 3 PID: 1068 Comm: cat Not tainted 6.6.0-rc7 #2 [ 1293.522901] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 1293.524368] Call Trace: [ 1293.524711] <TASK> [ 1293.525015] dump_stack_lvl+0x64/0x80 [ 1293.525519] __might_resched+0x173/0x280 [ 1293.526059] lock_sock_nested+0x43/0x80 [ 1293.526578] ? inet_getname+0x29/0xa0 [ 1293.527097] inet_getname+0x29/0xa0 [ 1293.527584] cifs_debug_data_proc_show+0xcf9/0xee0 [cifs] [ 1293.528360] seq_read_iter+0x118/0x470 [ 1293.528877] proc_reg_read_iter+0x53/0x90 [ 1293.529419] ? srso_alias_return_thunk+0x5/0x7f [ 1293.530037] vfs_read+0x201/0x350 [ 1293.530507] ksys_read+0x75/0x100 [ 1293.530968] do_syscall_64+0x3f/0x90 [ 1293.531461] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 1293.532138] RIP: 0033:0x7f71d767e381 [ 1293.532630] Code: ff ff eb c3 e8 0e ea 01 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 90 90 80 3d a5 f6 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 48 83 ec 28 48 89 54 [ 1293.535095] RSP: 002b:00007ffc312d65a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 1293.536106] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f71d767e381 [ 1293.537056] RDX: 0000000000020000 RSI: 00007f71d74f8000 RDI: 0000000000000003 [ 1293.538003] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000 [ 1293.538957] R10: 0000000000000022 R11: 0000000000000246 R12: 00007f71d74f8000 [ 1293.539908] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000020000 [ 1293.540877] </TASK> [ 1293.541233] [ 1293.541449] ====================================================== [ 1293.542270] WARNING: possible circular locking dependency detected [ 1293.543098] 6.6.0-rc7 #2 Tainted: G W [ 1293.543782] ------------------------------------------------------ [ 1293.544606] cat/1068 is trying to acquire lock: [ 1293.545214] ffffffffc015b5f8 (&cifs_tcp_ses_lock){+.+.}-{2:2}, at: cifs_put_tcp_session+0x1c/0x180 [cifs] [ 1293.546516] [ 1293.546516] but task is already holding lock: [ 1293.547292] ffff88800aed2b28 (&ret_buf->chan_lock){+.+.}-{2:2}, at: cifs_debug_data_proc_show+0x555/0xee0 [cifs] [ 1293.548454] [ 1293.548454] which lock already depends on the new lock. [ 1293.548454] [ 1293.549350] [ 1293.549350] the existing dependency chain (in reverse order) is: [ 1293.550183] [ 1293.550183] -> #1 (&ret_buf->chan_lock){+.+.}-{2:2}: [ 1293.550899] _raw_spin_lock+0x34/0x80 [ 1293.551401] cifs_debug_data_proc_show+0x555/0xee0 [cifs] [ 1293.552082] seq_read_iter+0x118/0x470 [ 1293.552556] proc_reg_read_iter+0x53/0x90 [ 1293.553054] vfs_read+0x201/0x350 [ 1293.553490] ksys_read+0x75/0x100 [ 1293.553925] do_syscall_64+0x3f/0x90 [ 1293.554389] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 1293.555004] [ 1293.555004] -> #0 (&cifs_tcp_ses_lock){+.+.}-{2:2}: [ 1293.555709] __lock_acquire+0x1521/0x2660 [ 1293.556218] lock_acquire+0xbf/0x2b0 [ 1293.556680] _raw_spin_lock+0x34/0x80 [ 1293.557148] cifs_put_tcp_session+0x1c/0x180 [cifs] [ 1293.557773] cifs_debug_data_proc_show+0xd15/0xee0 [cifs] [ 1293.558463] seq_read_iter+0x118/0x470 [ 1293.558945] proc_reg_read_iter+0x53/0x90 [ 1293.559450] vfs_read+0x201/0x350 [ 1293.559882] ksys_read+0x75/0x100 [ 1293.560317] do_syscall_64+0x3f/0x90 [ 1293.560773] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 1293.561390] [ 1293.561390] other info that might help us debug this: [ 1293.561390] [ 1293.562267] Possible unsafe locking scenario: [ 1293.562267] [ 1293.562927] CPU0 CPU1 [ 1293.563394] ---- ---- [ 1293.563754] lock(&ret_buf->chan_lock); [ 1293.564068] lock(&cifs_tcp_ses_lock); [ 1293.564573] lock(&ret_buf->chan_lock); [ 1293.565077] lock(&cifs_tcp_ses_lock); [ 1293.565387] [ 1293.565387] *** DEADLOCK *** [ 1293.565387] [ 1293.565852] 2 locks held by cat/1068: [ 1293.566147] #0: ffff88800818fc48 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x59/0x470 [ 1293.566767] #1: ffff88800aed2b28 (&ret_buf->chan_lock){+.+.}-{2:2}, at: cifs_debug_data_proc_show+0x555/0xee0 [cifs] [ 1293.567611] [ 1293.567611] stack backtrace: [ 1293.567954] CPU: 3 PID: 1068 Comm: cat Tainted: G W 6.6.0-rc7 #2 [ 1293.568536] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 1293.569387] Call Trace: [ 1293.569585] <TASK> [ 1293.569755] dump_stack_lvl+0x4a/0x80 [ 1293.570047] check_noncircular+0x14e/0x170 [ 1293.570373] ? save_trace+0x3e/0x390 [ 1293.570659] __lock_acquire+0x1521/0x2660 [ 1293.570982] lock_acquire+0xbf/0x2b0 [ 1293.571268] ? cifs_put_tcp_session+0x1c/0x180 [cifs] [ 1293.571687] _raw_spin_lock+0x34/0x80 [ 1293.571977] ? cifs_put_tcp_session+0x1c/0x180 [cifs] [ 1293.572394] cifs_put_tcp_session+0x1c/0x180 [cifs] [ 1293.572795] cifs_debug_data_proc_show+0xd15/0xee0 [cifs] [ 1293.573241] seq_read_iter+0x118/0x470 [ 1293.573546] proc_reg_read_iter+0x53/0x90 [ 1293.573861] ? srso_alias_return_thunk+0x5/0x7f [ 1293.574218] vfs_read+0x201/0x350 [ 1293.574489] ksys_read+0x75/0x100 [ 1293.574752] do_syscall_64+0x3f/0x90 [ 1293.575030] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 1293.575428] RIP: 0033:0x7f71d767e381 [ 1293.575716] Code: ff ff eb c3 e8 0e ea 01 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 90 90 80 3d a5 f6 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 48 83 ec 28 48 89 54 [ 1293.577151] RSP: 002b:00007ffc312d65a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 1293.577736] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f71d767e381 [ 1293.578286] RDX: 0000000000020000 RSI: 00007f71d74f8000 RDI: 0000000000000003 [ 1293.578839] RBP: 0000000000020000 R08: 00000000ffffffff R09: 0000000000000000 [ 1293.579391] R10: 0000000000000022 R11: 0000000000000246 R12: 00007f71d74f8000 [ 1293.579951] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000020000 [ 1293.580511] </TASK>