On Thu, Aug 19, 2021 at 10:17:54AM +0800, Wenpeng Liang wrote: > From: Lang Cheng <chenglang@xxxxxxxxxx> > > During the reset, the driver calls dma_pool_destroy() to release the > dma_pool resources. If the dma_pool_free interface is called during the > modify_qp operation, an exception will occur. > > [15834.440744] Unable to handle kernel paging request at virtual address > ffffa2cfc7725678 > ... > [15834.660596] Call trace: > [15834.663033] queued_spin_lock_slowpath+0x224/0x308 > [15834.667802] _raw_spin_lock_irqsave+0x78/0x88 > [15834.672140] dma_pool_free+0x34/0x118 > [15834.675799] hns_roce_free_cmd_mailbox+0x54/0x88 [hns_roce_hw_v2] > [15834.681872] hns_roce_v2_qp_modify.isra.57+0xcc/0x120 [hns_roce_hw_v2] > [15834.688376] hns_roce_v2_modify_qp+0x4d4/0x1ef8 [hns_roce_hw_v2] > [15834.694362] hns_roce_modify_qp+0x214/0x5a8 [hns_roce_hw_v2] > [15834.699996] _ib_modify_qp+0xf0/0x308 > [15834.703642] ib_modify_qp+0x38/0x48 > [15834.707118] rt_ktest_modify_qp+0x14c/0x998 [rdma_test] > ... > [15837.269216] Unable to handle kernel paging request at virtual address > 000197c995a1d1b4 > ... > [15837.480898] Call trace: > [15837.483335] __free_pages+0x28/0x78 > [15837.486807] dma_direct_free_pages+0xa0/0xe8 > [15837.491058] dma_direct_free+0x48/0x60 > [15837.494790] dma_free_attrs+0xa4/0xe8 > [15837.498449] hns_roce_buf_free+0xb0/0x150 [hns_roce_hw_v2] > [15837.503918] mtr_free_bufs.isra.1+0x88/0xc0 [hns_roce_hw_v2] > [15837.509558] hns_roce_mtr_destroy+0x60/0x80 [hns_roce_hw_v2] > [15837.515198] hns_roce_v2_cleanup_eq_table+0x1d0/0x2a0 [hns_roce_hw_v2] > [15837.521701] hns_roce_exit+0x108/0x1e0 [hns_roce_hw_v2] > [15837.526908] __hns_roce_hw_v2_uninit_instance.isra.75+0x70/0xb8 [hns_roce_hw_v2] > [15837.534276] hns_roce_hw_v2_uninit_instance+0x64/0x80 [hns_roce_hw_v2] > [15837.540786] hclge_uninit_client_instance+0xe8/0x1e8 [hclge] > [15837.546419] hnae3_uninit_client_instance+0xc4/0x118 [hnae3] > [15837.552052] hnae3_unregister_client+0x16c/0x1f0 [hnae3] > [15837.557346] hns_roce_hw_v2_exit+0x34/0x50 [hns_roce_hw_v2] > [15837.562895] __arm64_sys_delete_module+0x208/0x268 > [15837.567665] el0_svc_common.constprop.4+0x110/0x200 > [15837.572520] do_el0_svc+0x34/0x98 > [15837.575821] el0_svc+0x14/0x40 > [15837.578862] el0_sync_handler+0xb0/0x2d0 > [15837.582766] el0_sync+0x140/0x180 > > It is caused by two concurrent processes: > uninit_instance->dma_pool_destroy(cmdq) > modify_qp->dma_poll_free(cmdq) Something else has gone wrong in your system. modify_qp is not allowed to be running after ib_unregister_device() returns. I see: [15834.707118] rt_ktest_modify_qp+0x14c/0x998 [rdma_test] Which suggest to me your ULP is a test and that test is not properly acting as an ib_client. When a client is unregistered it must close all RDMA objects and stop all activity before the client unregister callback returns. Jason