On Fri, Jun 24, 2022 at 12:02:53PM +0800, Li Zhijian wrote:
> srp_exit_cmd_priv() will try to access srp_device by Scsi_Host like below:
>
> Scsi_Host                srp_target_port     srp_host          srp_device
> +------------------+ +-- +--------------+  +>+----------+   +->+---------+
> |                  | |   |              |  | |          |   |  |         |
> |                  | |   | *srp_host    +--+ | *srp_dev +---+  | *dev    |
> +-+hostdata--------+-+   |              |    |          |      |         |
> | | srp_target_port|     |              |    |          |      |         |
> | |                |     |              |    |          |      |         |
> | |                |     |              |    |          |      |         |
> +-+----------------+---- +--------------+    +----------+      +---------+
>
> But sometimes Scsi_Host still keeps the reference to an srp_host that has
> possibly been released already. This can happen if I frequently abort
> (Ctrl-c) the blktests while they are running, which then causes the error below:
>
> [ 952.299153] Freed by task 17289:
> [ 952.299156] kasan_save_stack+0x1e/0x40
> [ 952.299160] kasan_set_track+0x21/0x30
> [ 952.299164] kasan_set_free_info+0x20/0x30
> [ 952.299169] __kasan_slab_free+0x108/0x170
> [ 952.299173] kfree+0x9a/0x320
> [ 952.299177] srp_remove_one+0x114/0x180 [ib_srp]
> [ 952.299189] remove_client_context+0x8f/0xd0 [ib_core]
> [ 952.299269] disable_device+0xee/0x1e0 [ib_core]
> [ 952.299348] __ib_unregister_device+0x59/0xf0 [ib_core]
> [ 952.299429] ib_unregister_device_and_put+0x3b/0x50 [ib_core]
> [ 952.299509] nldev_dellink+0x126/0x1b0 [ib_core]
> [ 952.299592] rdma_nl_rcv_msg+0x1cc/0x310 [ib_core]
> [ 952.299673] rdma_nl_rcv+0x172/0x200 [ib_core]
> [ 952.299760] netlink_unicast+0x36b/0x4a0
> [ 952.299770] netlink_sendmsg+0x3a9/0x6d0
> [ 952.299774] sock_sendmsg+0x91/0xa0
> [ 952.299783] __sys_sendto+0x16f/0x210
> [ 952.299788] __x64_sys_sendto+0x6f/0x80
> [ 952.299792] do_syscall_64+0x3b/0x90
> [ 952.299795] entry_SYSCALL_64_after_hwframe+0x46/0xb0

I don't even understand how get_device() prevents this call chain??

It looks to me like the problem is that srp_remove_one() is not waiting
for or canceling some outstanding work.

Jason
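
To illustrate the ordering concern above, here is a minimal userspace sketch. It is not ib_srp code: every name in it (demo_dev, demo_host, demo_work, demo_host_start, demo_host_remove) is hypothetical, and a pthread stands in for deferred kernel work. The point is only that the remove path waits for outstanding work before freeing the object that the work dereferences. In kernel code the analogous waiting primitives would be calls such as cancel_work_sync() or flush_workqueue(); which of them, if any, fits srp_remove_one() is not established in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for an object that deferred work dereferences. */
struct demo_dev {
	int value;
};

/* Hypothetical stand-in for a host owning one piece of outstanding work. */
struct demo_host {
	struct demo_dev *dev;
	pthread_t worker;	/* plays the role of a queued work item */
	int result;		/* written by the worker, read after the join */
};

/* The "work": it reads host->dev, so dev must outlive it. */
static void *demo_work(void *arg)
{
	struct demo_host *host = arg;

	assert(host->dev != NULL);
	host->result = host->dev->value;
	return NULL;
}

/* Allocate the device and start the work. Returns 0 on success, -1 on error. */
int demo_host_start(struct demo_host *host, int value)
{
	host->dev = malloc(sizeof(*host->dev));
	if (!host->dev)
		return -1;
	host->dev->value = value;
	host->result = 0;

	if (pthread_create(&host->worker, NULL, demo_work, host) != 0) {
		free(host->dev);
		host->dev = NULL;
		return -1;
	}
	return 0;
}

/*
 * Remove path: wait for the outstanding work first, then free.
 * Freeing before the join would let demo_work() touch freed memory,
 * which is the shape of the use-after-free in the KASAN report.
 */
int demo_host_remove(struct demo_host *host)
{
	pthread_join(host->worker, NULL);
	free(host->dev);
	host->dev = NULL;
	return host->result;
}
```

A caller would pair demo_host_start() with exactly one demo_host_remove(); on older toolchains the sketch may need to be built with -pthread.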