Re: Apparent regression in blktests since 5.18-rc1+

On Sat, May 07, 2022 at 08:29:31AM +0800, Yanjun Zhu wrote:

> > If I try to run the SRP test 002 with the soft-RoCE driver, the
> > following appears:
> > 
> > [  749.901966] ================================
> > [  749.903638] WARNING: inconsistent lock state
> > [  749.905376] 5.18.0-rc5-dbg+ #1 Not tainted
> > [  749.907039] --------------------------------
> > [  749.908699] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > [  749.910646] ksoftirqd/5/40 [HC0[0]:SC1[1]:HE0:SE0] takes:
> > [  749.912499] ffff88818244d350 (&xa->xa_lock#14){+.?.}-{2:2}, at:
> > rxe_pool_get_index+0x73/0x170 [rdma_rxe]
> > [  749.914691] {SOFTIRQ-ON-W} state was registered at:
> > [  749.916648]   __lock_acquire+0x45b/0xce0
> > [  749.918599]   lock_acquire+0x18a/0x450
> > [  749.920480]   _raw_spin_lock+0x34/0x50
> > [  749.922580]   __rxe_add_to_pool+0xcc/0x140 [rdma_rxe]
> > [  749.924583]   rxe_alloc_pd+0x2d/0x40 [rdma_rxe]
> > [  749.926394]   __ib_alloc_pd+0xa3/0x270 [ib_core]
> > [  749.928579]   ib_mad_port_open+0x44a/0x790 [ib_core]
> > [  749.930640]   ib_mad_init_device+0x8e/0x110 [ib_core]
> > [  749.932495]   add_client_context+0x26a/0x330 [ib_core]
> > [  749.934302]   enable_device_and_get+0x169/0x2b0 [ib_core]
> > [  749.936217]   ib_register_device+0x26f/0x330 [ib_core]
> > [  749.938020]   rxe_register_device+0x1b4/0x1d0 [rdma_rxe]
> > [  749.939794]   rxe_add+0x8c/0xc0 [rdma_rxe]
> > [  749.941552]   rxe_net_add+0x5b/0x90 [rdma_rxe]
> > [  749.943356]   rxe_newlink+0x71/0x80 [rdma_rxe]
> > [  749.945182]   nldev_newlink+0x21e/0x370 [ib_core]
> > [  749.946917]   rdma_nl_rcv_msg+0x200/0x410 [ib_core]
> > [  749.948657]   rdma_nl_rcv+0x140/0x220 [ib_core]
> > [  749.950373]   netlink_unicast+0x307/0x460
> > [  749.952063]   netlink_sendmsg+0x422/0x750
> > [  749.953672]   __sys_sendto+0x1c2/0x250
> > [  749.955281]   __x64_sys_sendto+0x7f/0x90
> > [  749.956849]   do_syscall_64+0x35/0x80
> > [  749.958353]   entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  749.959942] irq event stamp: 1411849
> > [  749.961517] hardirqs last  enabled at (1411848): [<ffffffff810cdb28>]
> > __local_bh_enable_ip+0x88/0xf0
> > [  749.963338] hardirqs last disabled at (1411849): [<ffffffff81ebf24d>]
> > _raw_spin_lock_irqsave+0x5d/0x60
> > [  749.965214] softirqs last  enabled at (1411838): [<ffffffff82200467>]
> > __do_softirq+0x467/0x6e1
> > [  749.967027] softirqs last disabled at (1411843): [<ffffffff810cd947>]
> > run_ksoftirqd+0x37/0x60
> For this, please use this patch series:
> news://nntp.lore.kernel.org:119/20220422194416.983549-1-yanjun.zhu@xxxxxxxxx

No, that is the wrong fix for this. The problem is mismatched lock
modes: the xarray lock is taken with plain spin_lock in process
context, while the lookup path takes it in BH context. The fix is to
consistently use BH locking with the xarray everywhere, or to use RCU.
I'm expecting to go with Bob's RCU patch.
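
For readers following along, here is a rough userspace model of what an
RCU-style lookup has to do. It is only a sketch and not rxe code: the
names pool_elem, pool_slots, elem_get_unless_zero and pool_lookup are
hypothetical, and C11 atomics stand in for the kernel primitives. In the
kernel the lookup would typically sit between rcu_read_lock() and
rcu_read_unlock() around xa_load(), and the reference would be taken
with kref_get_unless_zero(). The point is that the read side takes no
xa_lock at all, so there is no lock mode left to mismatch, but it must
refuse objects whose last reference is already gone.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 8

/* Hypothetical pool element; refcnt == 0 means teardown has started. */
struct pool_elem {
	atomic_int refcnt;
	int payload;
};

/* Stand-in for the xarray: index -> element pointer (NULL if absent). */
static struct pool_elem *pool_slots[POOL_SIZE];

/* Take a reference only if the count is still non-zero. */
static bool elem_get_unless_zero(struct pool_elem *e)
{
	int old = atomic_load(&e->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&e->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Lookup without taking any pool lock on the read side. In the kernel,
 * RCU would keep the element memory valid across the window between
 * loading the pointer and taking the reference; this model does not
 * reproduce that part.
 */
static struct pool_elem *pool_lookup(unsigned int index)
{
	struct pool_elem *e;

	if (index >= POOL_SIZE)
		return NULL;
	e = pool_slots[index];
	if (e && elem_get_unless_zero(e))
		return e;
	return NULL;
}
```

The other option (BH locking everywhere) keeps the lock on the lookup
path and instead switches every process-context user of the xarray to
the _bh lock variants, e.g. xa_lock_bh()/xa_unlock_bh().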

We still need a proper patch for the AH problem.

Jason


