RE: [syzbot] [rdma?] possible deadlock in siw_create_listen (2)

> -----Original Message-----
> From: syzbot <syzbot+3eb27595de9aa3cf63c3@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Sent: Thursday, September 26, 2024 3:34 PM
> To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>; jgg@xxxxxxxx; leon@xxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; syzkaller-bugs@xxxxxxxxxxxxxxxx
> Subject: [EXTERNAL] [syzbot] [rdma?] possible deadlock in siw_create_listen
> (2)
> 
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    5f5673607153 Merge branch 'for-next/core' into for-kernelci
> git tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=149fdca9980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=dedbcb1ff4387972
> dashboard link: https://syzkaller.appspot.com/bug?extid=3eb27595de9aa3cf63c3
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for
> Debian) 2.40
> userspace arch: arm64
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/40172aed5414/disk-5f567360.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/58372f305e9d/vmlinux-5f567360.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/d2aae6fa798f/Image-5f567360.gz.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the
> commit:
> Reported-by: syzbot+3eb27595de9aa3cf63c3@xxxxxxxxxxxxxxxxxxxxxxxxx
> 
> iwpm_register_pid: Unable to send a nlmsg (client = 2)
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.11.0-rc7-syzkaller-g5f5673607153 #0 Not tainted
> ------------------------------------------------------
> syz.4.157/7931 is trying to acquire lock:
> ffff0000ee056458 (sk_lock-AF_INET){+.+.}-{0:0}, at:
> siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
> 
> but task is already holding lock:
> ffff800091c21ea8 (lock#7){+.+.}-{3:3}, at: cma_add_one+0x510/0xab4
> drivers/infiniband/core/cma.c:5354
> 
> which lock already depends on the new lock.
> 

Could someone please help me understand this situation?
cma.c:5354

        mutex_lock(&lock);
        list_add_tail(&cma_dev->list, &dev_list);
        list_for_each_entry(id_priv, &listen_any_list, listen_any_item) {
                ret = cma_listen_on_dev(id_priv, cma_dev, &to_destroy);
                if (ret)
                        goto free_listen;
        }               
        mutex_unlock(&lock);

siw_cm.c:1776
	sock_set_reuseaddr(s->sk);

...which calls lock_sock(sk) on a freshly created socket.

I don't see the dependency between the global cma lock and the socket lock.

Any help appreciated!

Thanks,
Bernard.
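
One possible reading of the chain quoted below, offered as a sketch rather
than a confirmed analysis: lockdep tracks lock classes, not lock instances,
so a freshly created socket still belongs to the sk_lock-AF_INET class.
Entry #1 appears to record clcsock_release_lock being taken under a socket
lock of that class (via smc_sendmsg), #2 records rtnl_mutex taken under
clcsock_release_lock, and #3 records the cma lock (lock#7) taken under
rtnl_mutex. Taking sk_lock-AF_INET under the cma lock would then close the
cycle. The small user-space C model below is illustrative only; the enum,
the function names and the edge list are assumptions derived from the
report, not kernel code.

```c
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the report; this enum is hypothetical. */
enum lock_class {
	SK_LOCK_AF_INET,
	SMC_CLCSOCK_RELEASE_LOCK,
	RTNL_MUTEX,
	CMA_LOCK,		/* "lock#7" in the report */
	NR_LOCK_CLASSES
};

/* held_before[a][b]: some code path acquired class b while holding class a. */
static bool held_before[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

static void add_dependency(enum lock_class held, enum lock_class acquired)
{
	held_before[held][acquired] = true;
}

/* Depth-first search over the recorded "held before" edges. */
static bool reachable(enum lock_class from, enum lock_class to, bool *visited)
{
	if (from == to)
		return true;
	visited[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (held_before[from][next] && !visited[next] &&
		    reachable((enum lock_class)next, to, visited))
			return true;
	}
	return false;
}

/*
 * Acquiring 'acquired' while holding 'held' closes a cycle if 'held'
 * is already reachable from 'acquired' through existing dependencies.
 */
static bool would_deadlock(enum lock_class held, enum lock_class acquired)
{
	bool visited[NR_LOCK_CLASSES] = { false };

	return reachable(acquired, held, visited);
}

/* Edges as read from entries #1..#3 of the report (an interpretation). */
static void record_existing_chain(void)
{
	memset(held_before, 0, sizeof(held_before));
	add_dependency(SK_LOCK_AF_INET, SMC_CLCSOCK_RELEASE_LOCK);	/* #1 */
	add_dependency(SMC_CLCSOCK_RELEASE_LOCK, RTNL_MUTEX);		/* #2 */
	add_dependency(RTNL_MUTEX, CMA_LOCK);				/* #3 */
}
```

After record_existing_chain(), would_deadlock(CMA_LOCK, SK_LOCK_AF_INET)
returns true, which corresponds to the acquisition flagged in
siw_create_listen, while would_deadlock(SK_LOCK_AF_INET, CMA_LOCK) returns
false. If this reading is right, the cycle comes from the shared lock class
rather than from the listening socket itself.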


> 
> the existing dependency chain (in reverse order) is:
> 
> -> #3 (lock#7){+.+.}-{3:3}:
>        __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
>        __mutex_lock kernel/locking/mutex.c:752 [inline]
>        mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
>        cma_init+0x2c/0x158 drivers/infiniband/core/cma.c:5438
>        do_one_initcall+0x24c/0x9c0 init/main.c:1267
>        do_initcall_level+0x154/0x214 init/main.c:1329
>        do_initcalls+0x58/0xac init/main.c:1345
>        do_basic_setup+0x8c/0xa0 init/main.c:1364
>        kernel_init_freeable+0x324/0x478 init/main.c:1578
>        kernel_init+0x24/0x2a0 init/main.c:1467
>        ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860
> 
> -> #2 (rtnl_mutex){+.+.}-{3:3}:
>        __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
>        __mutex_lock kernel/locking/mutex.c:752 [inline]
>        mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
>        rtnl_lock+0x20/0x2c net/core/rtnetlink.c:79
>        do_ip_setsockopt+0xe8c/0x346c net/ipv4/ip_sockglue.c:1077
>        ip_setsockopt+0x80/0x128 net/ipv4/ip_sockglue.c:1417
>        tcp_setsockopt+0xcc/0xe8 net/ipv4/tcp.c:3768
>        sock_common_setsockopt+0xb0/0xcc net/core/sock.c:3735
>        smc_setsockopt+0x204/0x10fc net/smc/af_smc.c:3072
>        do_sock_setsockopt+0x2a0/0x4e0 net/socket.c:2324
>        __sys_setsockopt+0x128/0x1a8 net/socket.c:2347
>        __do_sys_setsockopt net/socket.c:2356 [inline]
>        __se_sys_setsockopt net/socket.c:2353 [inline]
>        __arm64_sys_setsockopt+0xb8/0xd4 net/socket.c:2353
>        __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
>        invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
>        el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
>        do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
>        el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
>        el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
>        el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
> 
> -> #1 (&smc->clcsock_release_lock){+.+.}-{3:3}:
>        __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
>        __mutex_lock kernel/locking/mutex.c:752 [inline]
>        mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
>        smc_switch_to_fallback+0x48/0xa80 net/smc/af_smc.c:902
>        smc_sendmsg+0xfc/0x9f8 net/smc/af_smc.c:2779
>        sock_sendmsg_nosec net/socket.c:730 [inline]
>        __sock_sendmsg net/socket.c:745 [inline]
>        __sys_sendto+0x374/0x4f4 net/socket.c:2204
>        __do_sys_sendto net/socket.c:2216 [inline]
>        __se_sys_sendto net/socket.c:2212 [inline]
>        __arm64_sys_sendto+0xd8/0xf8 net/socket.c:2212
>        __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
>        invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
>        el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
>        do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
>        el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
>        el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
>        el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
> 
> -> #0 (sk_lock-AF_INET){+.+.}-{0:0}:
>        check_prev_add kernel/locking/lockdep.c:3133 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3252 [inline]
>        validate_chain kernel/locking/lockdep.c:3868 [inline]
>        __lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
>        lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
>        lock_sock_nested net/core/sock.c:3543 [inline]
>        lock_sock include/net/sock.h:1607 [inline]
>        sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
>        siw_create_listen+0x164/0xd70
> drivers/infiniband/sw/siw/siw_cm.c:1776
>        iw_cm_listen+0x14c/0x204 drivers/infiniband/core/iwcm.c:585
>        cma_iw_listen drivers/infiniband/core/cma.c:2668 [inline]
>        rdma_listen+0x774/0xae4 drivers/infiniband/core/cma.c:3953
>        cma_listen_on_dev+0x320/0x64c drivers/infiniband/core/cma.c:2727
>        cma_add_one+0x5ec/0xab4 drivers/infiniband/core/cma.c:5357
>        add_client_context+0x45c/0x7d0 drivers/infiniband/core/device.c:727
>        enable_device_and_get+0x1a8/0x3e8
> drivers/infiniband/core/device.c:1338
>        ib_register_device+0xe40/0x108c
> drivers/infiniband/core/device.c:1426
>        siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
>        siw_newlink+0x80c/0xc2c drivers/infiniband/sw/siw/siw_main.c:489
>        nldev_newlink+0x49c/0x4fc drivers/infiniband/core/nldev.c:1794
>        rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
>        rdma_nl_rcv+0x5c4/0x858 drivers/infiniband/core/netlink.c:259
>        netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>        netlink_unicast+0x668/0x8a4 net/netlink/af_netlink.c:1357
>        netlink_sendmsg+0x7a4/0xa8c net/netlink/af_netlink.c:1901
>        sock_sendmsg_nosec net/socket.c:730 [inline]
>        __sock_sendmsg net/socket.c:745 [inline]
>        ____sys_sendmsg+0x56c/0x840 net/socket.c:2597
>        ___sys_sendmsg net/socket.c:2651 [inline]
>        __sys_sendmsg+0x26c/0x33c net/socket.c:2680
>        __do_sys_sendmsg net/socket.c:2689 [inline]
>        __se_sys_sendmsg net/socket.c:2687 [inline]
>        __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2687
>        __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
>        invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
>        el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
>        do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
>        el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
>        el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
>        el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   sk_lock-AF_INET --> rtnl_mutex --> lock#7
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(lock#7);
>                                lock(rtnl_mutex);
>                                lock(lock#7);
>   lock(sk_lock-AF_INET);
> 
>  *** DEADLOCK ***
> 
> 6 locks held by syz.4.157/7931:
>  #0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at:
> rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:164 [inline]
>  #0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at:
> rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
>  #0: ffff8000974142d8 (&rdma_nl_types[idx].sem){.+.+}-{3:3}, at:
> rdma_nl_rcv+0x330/0x858 drivers/infiniband/core/netlink.c:259
>  #1: ffff800091c0e870 (link_ops_rwsem){++++}-{3:3}, at:
> nldev_newlink+0x358/0x4fc drivers/infiniband/core/nldev.c:1784
>  #2: ffff800091bff210 (devices_rwsem){++++}-{3:3}, at:
> enable_device_and_get+0x104/0x3e8 drivers/infiniband/core/device.c:1328
>  #3: ffff800091bff510 (clients_rwsem){++++}-{3:3}, at:
> enable_device_and_get+0x160/0x3e8 drivers/infiniband/core/device.c:1336
>  #4: ffff0000d61505d0 (&device->client_data_rwsem){++++}-{3:3}, at:
> add_client_context+0x424/0x7d0 drivers/infiniband/core/device.c:725
>  #5: ffff800091c21ea8 (lock#7){+.+.}-{3:3}, at: cma_add_one+0x510/0xab4
> drivers/infiniband/core/cma.c:5354
> 
> stack backtrace:
> CPU: 0 UID: 0 PID: 7931 Comm: syz.4.157 Not tainted 6.11.0-rc7-syzkaller-
> g5f5673607153 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 08/06/2024
> Call trace:
>  dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:319
>  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:326
>  __dump_stack lib/dump_stack.c:93 [inline]
>  dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:119
>  dump_stack+0x1c/0x28 lib/dump_stack.c:128
>  print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2059
>  check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2186
>  check_prev_add kernel/locking/lockdep.c:3133 [inline]
>  check_prevs_add kernel/locking/lockdep.c:3252 [inline]
>  validate_chain kernel/locking/lockdep.c:3868 [inline]
>  __lock_acquire+0x33d8/0x779c kernel/locking/lockdep.c:5142
>  lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5759
>  lock_sock_nested net/core/sock.c:3543 [inline]
>  lock_sock include/net/sock.h:1607 [inline]
>  sock_set_reuseaddr+0x58/0x154 net/core/sock.c:782
>  siw_create_listen+0x164/0xd70 drivers/infiniband/sw/siw/siw_cm.c:1776
>  iw_cm_listen+0x14c/0x204 drivers/infiniband/core/iwcm.c:585
>  cma_iw_listen drivers/infiniband/core/cma.c:2668 [inline]
>  rdma_listen+0x774/0xae4 drivers/infiniband/core/cma.c:3953
>  cma_listen_on_dev+0x320/0x64c drivers/infiniband/core/cma.c:2727
>  cma_add_one+0x5ec/0xab4 drivers/infiniband/core/cma.c:5357
>  add_client_context+0x45c/0x7d0 drivers/infiniband/core/device.c:727
>  enable_device_and_get+0x1a8/0x3e8 drivers/infiniband/core/device.c:1338
>  ib_register_device+0xe40/0x108c drivers/infiniband/core/device.c:1426
>  siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline]
>  siw_newlink+0x80c/0xc2c drivers/infiniband/sw/siw/siw_main.c:489
>  nldev_newlink+0x49c/0x4fc drivers/infiniband/core/nldev.c:1794
>  rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
>  rdma_nl_rcv+0x5c4/0x858 drivers/infiniband/core/netlink.c:259
>  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>  netlink_unicast+0x668/0x8a4 net/netlink/af_netlink.c:1357
>  netlink_sendmsg+0x7a4/0xa8c net/netlink/af_netlink.c:1901
>  sock_sendmsg_nosec net/socket.c:730 [inline]
>  __sock_sendmsg net/socket.c:745 [inline]
>  ____sys_sendmsg+0x56c/0x840 net/socket.c:2597
>  ___sys_sendmsg net/socket.c:2651 [inline]
>  __sys_sendmsg+0x26c/0x33c net/socket.c:2680
>  __do_sys_sendmsg net/socket.c:2689 [inline]
>  __se_sys_sendmsg net/socket.c:2687 [inline]
>  __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2687
>  __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
>  invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
>  el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
>  do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
>  el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
>  el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
>  el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598
> infiniband syz1: RDMA CMA: cma_listen_on_dev, error -98
> overlay: ./file0 is not a directory
> xt_nfacct: accounting object `sy' does not exists
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.
> 
> syzbot will keep track of this issue. See:
> INVALID URI REMOVED
> 23status&d=DwIFaQ&c=BSDicqBQBDjDI9RkVyTcHQ&r=4ynb4Sj_4MUcZXbhvovE4tYSbqxyOw
> dSiLedP4yO55g&m=sr8zZDt-
> X4vizrmZ7oTOUCc_f4tzlpuv7bRCJEXpp32wPy_dhtBfJCqKKk2V7Bp0&s=HTvYoHo7kNGdhvI6
> p66EC7F21n9dIQYD3aC3N_qXllQ&e=  for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup



