On Mon, Nov 22, 2021 at 08:32:53PM +0800, Wen Gu wrote:
> Possible recursive locking is detected by lockdep when SMC
> falls back to TCP. The corresponding warnings are as follows:
>
>  ============================================
>  WARNING: possible recursive locking detected
>  5.16.0-rc1+ #18 Tainted: G    E
>  --------------------------------------------
>  wrk/1391 is trying to acquire lock:
>  ffff975246c8e7d8 (&ei->socket.wq.wait){..-.}-{3:3}, at: smc_switch_to_fallback+0x109/0x250 [smc]
>
>  but task is already holding lock:
>  ffff975246c8f918 (&ei->socket.wq.wait){..-.}-{3:3}, at: smc_switch_to_fallback+0xfe/0x250 [smc]
>
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
>
>         CPU0
>         ----
>    lock(&ei->socket.wq.wait);
>    lock(&ei->socket.wq.wait);
>
>   *** DEADLOCK ***
>
>   May be due to missing lock nesting notation
>
>  2 locks held by wrk/1391:
>   #0: ffff975246040130 (sk_lock-AF_SMC){+.+.}-{0:0}, at: smc_connect+0x43/0x150 [smc]
>   #1: ffff975246c8f918 (&ei->socket.wq.wait){..-.}-{3:3}, at: smc_switch_to_fallback+0xfe/0x250 [smc]
>
>  stack backtrace:
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x56/0x7b
>   __lock_acquire+0x951/0x11f0
>   lock_acquire+0x27a/0x320
>   ? smc_switch_to_fallback+0x109/0x250 [smc]
>   ? smc_switch_to_fallback+0xfe/0x250 [smc]
>   _raw_spin_lock_irq+0x3b/0x80
>   ? smc_switch_to_fallback+0x109/0x250 [smc]
>   smc_switch_to_fallback+0x109/0x250 [smc]
>   smc_connect_fallback+0xe/0x30 [smc]
>   __smc_connect+0xcf/0x1090 [smc]
>   ? mark_held_locks+0x61/0x80
>   ? __local_bh_enable_ip+0x77/0xe0
>   ? lockdep_hardirqs_on+0xbf/0x130
>   ? smc_connect+0x12a/0x150 [smc]
>   smc_connect+0x12a/0x150 [smc]
>   __sys_connect+0x8a/0xc0
>   ? syscall_enter_from_user_mode+0x20/0x70
>   __x64_sys_connect+0x16/0x20
>   do_syscall_64+0x34/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> The nested locking in smc_switch_to_fallback() is flagged as a
> possible deadlock because smc_wait->lock and clc_wait->lock belong
> to the same lock class. It is actually safe so far, since no other
> code path tries to obtain smc_wait->lock while holding
> clc_wait->lock. So this patch replaces spin_lock() with
> spin_lock_nested() to avoid the false report by lockdep.
>
> Link: https://lkml.org/lkml/2021/11/19/962
> Fixes: 2153bd1e3d3d ("Transfer remaining wait queue entries during fallback")
> Reported-by: syzbot+e979d3597f48262cb4ee@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Wen Gu <guwen@xxxxxxxxxxxxxxxxx>

Acked-by: Tony Lu <tonylu@xxxxxxxxxxxxxxxxx>

> ---
>  net/smc/af_smc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index b61c802..2692cba 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -585,7 +585,7 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
>  	 * to clcsocket->wq during the fallback.
>  	 */
>  	spin_lock_irqsave(&smc_wait->lock, flags);
> -	spin_lock(&clc_wait->lock);
> +	spin_lock_nested(&clc_wait->lock, SINGLE_DEPTH_NESTING);
>  	list_splice_init(&smc_wait->head, &clc_wait->head);
>  	spin_unlock(&clc_wait->lock);
>  	spin_unlock_irqrestore(&smc_wait->lock, flags);
> --
> 1.8.3.1
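
For anyone new to lockdep's nesting annotation, here is a minimal
sketch of the pattern the patch uses (the struct and function names
below are illustrative, not taken from the kernel). Lockdep keys
locks by class rather than by instance, and all locks initialized
from the same spin_lock_init() call site share one class, so taking
a second instance of that class looks recursive unless the nesting
is annotated:

    #include <linux/list.h>
    #include <linux/spinlock.h>

    struct bucket {
    	spinlock_t lock;	/* every bucket's lock shares one class */
    	struct list_head head;
    };

    static void bucket_init(struct bucket *b)
    {
    	spin_lock_init(&b->lock);
    	INIT_LIST_HEAD(&b->head);
    }

    /* Move all entries from one bucket to another, holding both locks. */
    static void bucket_splice(struct bucket *from, struct bucket *to)
    {
    	unsigned long flags;

    	spin_lock_irqsave(&from->lock, flags);
    	/*
    	 * to->lock is in the same lock class as from->lock, so a
    	 * plain spin_lock() here would trigger the "possible
    	 * recursive locking" splat. SINGLE_DEPTH_NESTING marks this
    	 * as a deliberate, bounded one-level nesting of a distinct
    	 * lock instance.
    	 */
    	spin_lock_nested(&to->lock, SINGLE_DEPTH_NESTING);
    	list_splice_init(&from->head, &to->head);
    	spin_unlock(&to->lock);
    	spin_unlock_irqrestore(&from->lock, flags);
    }

Note that spin_lock_nested() is purely an annotation: with lockdep
disabled it degrades to a plain spin_lock(). So, as the commit
message says, correctness still relies on no other path taking the
two locks in the opposite order.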