On Sat, May 07, 2022 at 08:29:31AM +0800, Yanjun Zhu wrote:
> > If I try to run the SRP test 002 with the soft-RoCE driver, the
> > following appears:
> >
> > [ 749.901966] ================================
> > [ 749.903638] WARNING: inconsistent lock state
> > [ 749.905376] 5.18.0-rc5-dbg+ #1 Not tainted
> > [ 749.907039] --------------------------------
> > [ 749.908699] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > [ 749.910646] ksoftirqd/5/40 [HC0[0]:SC1[1]:HE0:SE0] takes:
> > [ 749.912499] ffff88818244d350 (&xa->xa_lock#14){+.?.}-{2:2}, at:
> > rxe_pool_get_index+0x73/0x170 [rdma_rxe]
> > [ 749.914691] {SOFTIRQ-ON-W} state was registered at:
> > [ 749.916648] __lock_acquire+0x45b/0xce0
> > [ 749.918599] lock_acquire+0x18a/0x450
> > [ 749.920480] _raw_spin_lock+0x34/0x50
> > [ 749.922580] __rxe_add_to_pool+0xcc/0x140 [rdma_rxe]
> > [ 749.924583] rxe_alloc_pd+0x2d/0x40 [rdma_rxe]
> > [ 749.926394] __ib_alloc_pd+0xa3/0x270 [ib_core]
> > [ 749.928579] ib_mad_port_open+0x44a/0x790 [ib_core]
> > [ 749.930640] ib_mad_init_device+0x8e/0x110 [ib_core]
> > [ 749.932495] add_client_context+0x26a/0x330 [ib_core]
> > [ 749.934302] enable_device_and_get+0x169/0x2b0 [ib_core]
> > [ 749.936217] ib_register_device+0x26f/0x330 [ib_core]
> > [ 749.938020] rxe_register_device+0x1b4/0x1d0 [rdma_rxe]
> > [ 749.939794] rxe_add+0x8c/0xc0 [rdma_rxe]
> > [ 749.941552] rxe_net_add+0x5b/0x90 [rdma_rxe]
> > [ 749.943356] rxe_newlink+0x71/0x80 [rdma_rxe]
> > [ 749.945182] nldev_newlink+0x21e/0x370 [ib_core]
> > [ 749.946917] rdma_nl_rcv_msg+0x200/0x410 [ib_core]
> > [ 749.948657] rdma_nl_rcv+0x140/0x220 [ib_core]
> > [ 749.950373] netlink_unicast+0x307/0x460
> > [ 749.952063] netlink_sendmsg+0x422/0x750
> > [ 749.953672] __sys_sendto+0x1c2/0x250
> > [ 749.955281] __x64_sys_sendto+0x7f/0x90
> > [ 749.956849] do_syscall_64+0x35/0x80
> > [ 749.958353] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 749.959942] irq event stamp: 1411849
> > [ 749.961517] hardirqs last enabled at (1411848): [<ffffffff810cdb28>]
> > __local_bh_enable_ip+0x88/0xf0
> > [ 749.963338] hardirqs last disabled at (1411849): [<ffffffff81ebf24d>]
> > _raw_spin_lock_irqsave+0x5d/0x60
> > [ 749.965214] softirqs last enabled at (1411838): [<ffffffff82200467>]
> > __do_softirq+0x467/0x6e1
> > [ 749.967027] softirqs last disabled at (1411843): [<ffffffff810cd947>]
> > run_ksoftirqd+0x37/0x60
> To this, Please use this patch series
> news://nntp.lore.kernel.org:119/20220422194416.983549-1-yanjun.zhu@xxxxxxxxx

No, that is the wrong fix for this. This is mismatched lock modes with
the lookup path in the BH. The fix is to consistently use BH locking
with the xarray everywhere, or to use RCU.

I'm expecting to go with Bob's RCU patch.

We still need a proper patch for the AH problem.

Jason
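
For readers following the trace, the rule lockdep is enforcing can be
illustrated with a small userspace toy model. This is only a sketch, not
kernel code and not the rxe fix: every name below (lock_class_model,
model_acquire, the two scenario helpers) is hypothetical. It assumes the
simplified rule that a lock class must not be both taken in softirq
context and held in process context with softirqs still enabled. The RCU
alternative is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one lockdep rule. All names here are hypothetical. */
struct lock_class_model {
    bool used_in_softirq;      /* acquired while running in softirq context */
    bool held_softirq_enabled; /* acquired in process context, softirqs on */
};

enum ctx { CTX_PROCESS, CTX_SOFTIRQ };

/*
 * Record one acquisition. Returns false once the usage history is
 * inconsistent: a softirq could then interrupt a holder on the same CPU
 * and spin on the lock that holder already owns.
 */
static bool model_acquire(struct lock_class_model *lc, enum ctx c,
                          bool bh_disabled)
{
    assert(lc != NULL);
    if (c == CTX_SOFTIRQ)
        lc->used_in_softirq = true;
    else if (!bh_disabled)
        lc->held_softirq_enabled = true;
    return !(lc->used_in_softirq && lc->held_softirq_enabled);
}

/* Shape of the report: plain lock on the add path, lookup in softirq. */
static bool scenario_mixed(void)
{
    struct lock_class_model lc = {0};
    bool ok = model_acquire(&lc, CTX_PROCESS, false); /* plain lock */
    ok = model_acquire(&lc, CTX_SOFTIRQ, true) && ok; /* softirq lookup */
    return ok;
}

/* BH locking used on every process-context path. */
static bool scenario_bh_everywhere(void)
{
    struct lock_class_model lc = {0};
    bool ok = model_acquire(&lc, CTX_PROCESS, true);  /* _bh-style lock */
    ok = model_acquire(&lc, CTX_SOFTIRQ, true) && ok; /* softirq lookup */
    return ok;
}
```

In this model scenario_mixed() reports an inconsistency, matching the
{SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} transition in the trace, while
scenario_bh_everywhere() does not.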