RE: [syzbot] [rdma?] possible deadlock in siw_create_listen (2)

> -----Original Message-----
> From: Jason Gunthorpe <jgg@xxxxxxxx>
> Sent: Saturday, October 5, 2024 3:21 AM
> To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>
> Cc: leon@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> Subject: [EXTERNAL] Re: [syzbot] [rdma?] possible deadlock in
> siw_create_listen (2)
> 
> On Fri, Oct 04, 2024 at 04:10:31PM +0000, Bernard Metzler wrote:
> 
> > Could one please help me to understand this situation?
> > cma.c:5354
> >
> >         mutex_lock(&lock);
> >         list_add_tail(&cma_dev->list, &dev_list);
> >         list_for_each_entry(id_priv, &listen_any_list, listen_any_item) {
> >                 ret = cma_listen_on_dev(id_priv, cma_dev, &to_destroy);
> >                 if (ret)
> >                         goto free_listen;
> >         }
> >         mutex_unlock(&lock);
> >
> > siw_cm.c:1776
> > 	sock_set_reuseaddr(s->sk);
> >
> > ...which calls lock_sock(sk) on a freshly created socket.
> 
> I think this is a smc bug, and lockdep is getting confused about what
> to report due to all the different locks.
> 
> smc_setsockopt() eventually in ip_setsockopt() does:
> 
> 	mutex_lock(&smc->clcsock_release_lock);
> 
> 	if (needs_rtnl)
> 		rtnl_lock();
> 	sockopt_lock_sock(sk);
> 	mutex_unlock(&smc->clcsock_release_lock);
> 
> 
> smc_sendmsg() does
> 
> 	lock_sock(sk);
> 	mutex_lock(&smc->clcsock_release_lock);
> 
> Which is classic deadlock locking.
> 
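
For illustration, the inversion between the two quoted paths can be modelled in user space. This is only a minimal sketch: the enum names and the helpers record_order() and demo_inversion() are made up for this example and are not kernel APIs. It checks direct pairwise order only, whereas lockdep validates whole dependency chains.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the locks in the quoted paths. */
enum lock_class { SK_LOCK, CLCSOCK_RELEASE_LOCK, NR_CLASSES };

/* held_before[a][b] is true once some path took b while holding a. */
static bool held_before[NR_CLASSES][NR_CLASSES];

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns true if the opposite order is already on record (ABBA inversion).
 */
static bool record_order(enum lock_class outer, enum lock_class inner)
{
	assert(outer < NR_CLASSES && inner < NR_CLASSES);
	held_before[outer][inner] = true;
	return held_before[inner][outer];
}

/* Replays the two quoted orders; returns true if only the second conflicts. */
static bool demo_inversion(void)
{
	memset(held_before, 0, sizeof(held_before));

	/* setsockopt path: clcsock_release_lock first, then the socket lock */
	bool first = record_order(CLCSOCK_RELEASE_LOCK, SK_LOCK);

	/* sendmsg path: the socket lock first, then clcsock_release_lock */
	bool second = record_order(SK_LOCK, CLCSOCK_RELEASE_LOCK);

	return !first && second;
}
```

In this model the first order is recorded without complaint and the second one trips the check, which is the A-then-B versus B-then-A pattern described above.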

Thank you for helping to clarify this. That would make much more sense.
So blaming

> siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776

... isn't quite right. That code doesn't touch the SMC lock;
it only locks a just-created socket via

>> -> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
>>        check_prev_add kernel/locking/lockdep.c:3133 [inline]
>>        check_prevs_add kernel/locking/lockdep.c:3252 [inline]
>>        validate_chain kernel/locking/lockdep.c:3868 [inline]
>>        __lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
>>        lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
>>        lock_sock_nested net/core/sock.c:3543 [inline]
>>        lock_sock include/net/sock.h:1607 [inline]
>>        sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
>>        siw_create_listen+0x164/0xd70

> That the CMA gets involved here seems like wrong reporting because
> syzkaller put those lock chains into it.
> 
> I guess this is a dup of
> 
> lore.kernel.org/netdev/00000000000093078f0622583e6e@google.com/T/
> 
> Or at least that should be fixed before looking at this
> 
Sounds reasonable...

Thanks!
Bernard.



