Hi Bernard,

I'm running into the following problem while running xfstest generic/460 over NFSoRDMA with soft iWARP (siw). Any ideas why? Thank you.

==================================================================
[ 1306.370743] BUG: KASAN: use-after-free in down_write+0x78/0x110
[ 1306.372186] Write of size 8 at addr ffff888054beb180 by task kworker/u256:2/3880
[ 1306.373893]
[ 1306.374268] CPU: 0 PID: 3880 Comm: kworker/u256:2 Not tainted 6.0.0-rc4+ #118
[ 1306.375905] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 1306.378465] Workqueue: siw_cm_wq siw_cm_work_handler [siw]
[ 1306.379872] Call Trace:
[ 1306.380497]  <TASK>
[ 1306.381077]  dump_stack_lvl+0x33/0x46
[ 1306.382151]  print_report.cold.12+0xb2/0x6b7
[ 1306.383200]  ? down_write+0x78/0x110
[ 1306.384060]  kasan_report+0xa3/0x120
[ 1306.384914]  ? __wake_up_common+0x41/0x240
[ 1306.385879]  ? down_write+0x78/0x110
[ 1306.386725]  kasan_check_range+0x179/0x1e0
[ 1306.387690]  down_write+0x78/0x110
[ 1306.388526]  ? down_write_killable+0x120/0x120
[ 1306.389592]  siw_qp_llp_close+0x2b/0x250 [siw]
[ 1306.391019]  siw_cm_work_handler+0x3af/0x1410 [siw]
[ 1306.392358]  ? set_next_entity+0xb1/0x2b0
[ 1306.393463]  ? siw_qp_cm_drop+0x290/0x290 [siw]
[ 1306.394680]  ? preempt_count_sub+0x14/0xc0
[ 1306.395641]  ? _raw_spin_unlock+0x15/0x30
[ 1306.396588]  ? finish_task_switch+0xe5/0x3e0
[ 1306.397610]  ? __switch_to+0x2fa/0x760
[ 1306.398507]  ? __schedule+0x575/0xf50
[ 1306.399377]  process_one_work+0x3b1/0x6b0
[ 1306.400342]  worker_thread+0x5a/0x640
[ 1306.401209]  ? __kthread_parkme+0xae/0xd0
[ 1306.402167]  ? process_one_work+0x6b0/0x6b0
[ 1306.403306]  kthread+0x160/0x190
[ 1306.404085]  ? kthread_complete_and_exit+0x20/0x20
[ 1306.405252]  ret_from_fork+0x1f/0x30
[ 1306.406160]  </TASK>
[ 1306.406694]
[ 1306.407067] Allocated by task 3880:
[ 1306.407898]  kasan_save_stack+0x1c/0x40
[ 1306.408808]  __kasan_kmalloc+0x84/0xa0
[ 1306.409693]  __kmalloc+0x186/0x2f0
[ 1306.410504]  create_qp.part.23+0x2e8/0x330 [ib_core]
[ 1306.412062]  ib_create_qp_kernel+0x75/0x160 [ib_core]
[ 1306.413672]  rdma_create_qp+0x101/0x240 [rdma_cm]
[ 1306.414911]  rpcrdma_xprt_connect+0xa76/0xae0 [rpcrdma]
[ 1306.416606]  xprt_rdma_connect_worker+0x78/0x180 [rpcrdma]
[ 1306.418242]  process_one_work+0x3b1/0x6b0
[ 1306.419205]  worker_thread+0x5a/0x640
[ 1306.420200]  kthread+0x160/0x190
[ 1306.421024]  ret_from_fork+0x1f/0x30
[ 1306.422029]
[ 1306.422432] Freed by task 3882:
[ 1306.423242]  kasan_save_stack+0x1c/0x40
[ 1306.424450]  kasan_set_track+0x21/0x30
[ 1306.425451]  kasan_set_free_info+0x20/0x40
[ 1306.426545]  __kasan_slab_free+0x104/0x190
[ 1306.427599]  kfree+0x15b/0x360
[ 1306.428597]  ib_destroy_qp_user+0x11b/0x210 [ib_core]
[ 1306.430235]  rdma_destroy_qp+0x34/0xd0 [rdma_cm]
[ 1306.431464]  rpcrdma_ep_put+0x69/0x180 [rpcrdma]
[ 1306.432906]  rpcrdma_xprt_disconnect+0x3fb/0x4d0 [rpcrdma]
[ 1306.434601]  xprt_rdma_close+0xe/0x50 [rpcrdma]
[ 1306.436183]  xprt_autoclose+0xa0/0x190 [sunrpc]
[ 1306.438386]  process_one_work+0x3b1/0x6b0
[ 1306.439406]  worker_thread+0x5a/0x640
[ 1306.440559]  kthread+0x160/0x190
[ 1306.441574]  ret_from_fork+0x1f/0x30
[ 1306.442475]
[ 1306.442900] The buggy address belongs to the object at ffff888054beb000
[ 1306.442900]  which belongs to the cache kmalloc-2k of size 2048
[ 1306.445828] The buggy address is located 384 bytes inside of
[ 1306.445828]  2048-byte region [ffff888054beb000, ffff888054beb800)
[ 1306.448524]
[ 1306.448895] The buggy address belongs to the physical page:
[ 1306.450168] page:00000000bf2cbf48 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x54be8
[ 1306.452435] head:00000000bf2cbf48 order:3 compound_mapcount:0 compound_pincount:0
[ 1306.454495] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[ 1306.456354] raw: 000fffffc0010200 ffffea0001593600 dead000000000002 ffff888001042000
[ 1306.458377] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[ 1306.460189] page dumped because: kasan: bad access detected
[ 1306.461478]
[ 1306.461854] Memory state around the buggy address:
[ 1306.463056]  ffff888054beb080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1306.464791]  ffff888054beb100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1306.466471] >ffff888054beb180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1306.468128]                    ^
[ 1306.468902]  ffff888054beb200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1306.470611]  ffff888054beb280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1306.472279] ==================================================================

I have dynamic debugging turned on, but I'm not sure how far back before the KASAN output you'd be interested in seeing. This is what comes after it:

[ 1306.475413] siw0: QP[1] siw_qp_llp_close: llp close: no state transition needed: ERROR
[ 1306.477241] siw0: QP[1] siw_qp_llp_close: llp close exit: state ERROR
[ 1307.688823] siw0: CEP[0xffff888031ff1600] __siw_cep_dealloc: free endpoint

And this is what came before (note that the NFS layer detects the NFS server isn't responding):

[ 1301.718666] nfs: server 192.168.1.124 not responding, still trying
[ 1305.913263] siw0: CEP[0xffff888031ff1600] siw_cm_llp_state_change: state: 7
[ 1305.915269] siw0: CEP[0xffff888031ff1600] siw_cm_queue_work: [QP 1]: work type: 4, timeout 0
[ 1305.918056] siw0: CEP[0xffff888031ff1600] siw_cm_llp_error_report: error 104, socket state: 7, cep state: 7
[ 1305.918138] siw0: CEP[0xffff888031ff1600] siw_cm_work_handler: [QP 1]: work type: 4, state 7
[ 1305.921167] siw0: QP[1] siw_qp_sq_process: wqe type 9 processing failed: -104
[ 1305.923282] siw0: CEP[0xffff888031ff1600] siw_cm_upcall: [QP 1]: reason=4, status=0
[ 1305.925212] siw0: QP[1] siw_qp_sq_process: proc. read.response failed: -104
[ 1305.925243] siw0: QP[1] siw_sq_resume: SQ task failed: err -104
[ 1305.927612] siw0: CEP[0xffff888031ff1600] siw_cm_upcall: [QP 1]: reason=5, status=0
[ 1305.929894] siw0: QP[1] siw_qp_modify: state: RTS => CLOSING
[ 1305.931615] siw0: CEP[0xffff888031ff1600] siw_cm_work_handler: release: timer=n, QP[1]
[ 1305.931639] siw0: QP[1] siw_qp_llp_close: enter llp close, state = RTS
[ 1305.934168] siw0: CQ[500590320] siw_rqe_complete: Call completion handler
[ 1305.934239] siw0: CEP[0xffff888031ff1600] siw_qp_cm_drop: already closed
[ 1305.943804] siw0: QP[1] siw_verbs_modify_qp: desired IB QP state: ERR
[ 1305.945323] siw0: CQ[500590320] siw_reap_cqe: idx 0, type 8, flags 1, id 0xffff888031246c00
[ 1305.947253] siw0: CQ[500590320] siw_req_notify_cq: flags: 0x06
[ 1305.963279] siw0: QP[1] siw_qp_modify: state: CLOSING => ERROR
[ 1305.965195] siw0: CQ[500590320] siw_rqe_complete: Call completion handler
[ 1305.966914] siw0: CQ[500590320] siw_reap_cqe: idx 1, type 8, flags 1, id 0xffff88801d38fbd0
[ 1305.969017] siw0: CQ[500590320] siw_req_notify_cq: flags: 0x06
[ 1306.065108] siw0: QP[1] siw_verbs_modify_qp: desired IB QP state: ERR
[ 1306.066882] siw0: QP[1] siw_qp_modify: state: ERROR => ERROR
[ 1306.068578] siw0: CQ[3814951205] siw_sqe_complete: Call completion handler
[ 1306.070415] siw0: CQ[3814951205] siw_reap_cqe: idx 1, type 0, flags 1, id 0xffff88801d38fb68
[ 1306.072506] siw0: CQ[3814951205] siw_req_notify_cq: flags: 0x06
[ 1306.079748] siw0: MEM[0x4b102701] siw_dereg_mr: deregister MR
[ 1306.081309] siw0: MEM[0x4b102701] siw_free_mem: free mem, pbl: y
[ 1306.082814] siw0: MEM[0xe7728501] siw_dereg_mr: deregister MR
[ 1306.084320] siw0: MEM[0xe7728501] siw_free_mem: free mem, pbl: y
[ 1306.086048] siw0: MEM[0xa5b4d801] siw_dereg_mr: deregister MR
[ 1306.087476] siw0: MEM[0xa5b4d801] siw_free_mem: free mem, pbl: y
[ 1306.089028] siw0: MEM[0x6b7ed500] siw_dereg_mr: deregister MR
[ 1306.090457] siw0: MEM[0x6b7ed500] siw_free_mem: free mem, pbl: y
[ 1306.318414] siw0: QP[1] siw_destroy_qp: state 5
[ 1306.319688] siw0: QP[1] siw_qp_modify: state: ERROR => ERROR
[ 1306.321075] siw0: CQ[500590320] siw_destroy_cq: free CQ resources
[ 1306.322616] siw0: CQ[3814951205] siw_destroy_cq: free CQ resources
[ 1306.324373] siw0: MEM[0x16f39e00] siw_dereg_mr: deregister MR
[ 1306.325986] siw0: MEM[0x16f39e00] siw_free_mem: free mem, pbl: n
[ 1306.327995] siw0: PD[26] siw_dealloc_pd: free PD
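
For what it's worth, my (possibly wrong) reading of the two KASAN stacks: the QP kmalloc'ed on the rpcrdma connect path by task 3880 is kfree'd by ib_destroy_qp_user() on the xprt_autoclose/rpcrdma_xprt_disconnect path in task 3882, while the queued siw_cm_wq work still holds a pointer to that QP; when siw_cm_work_handler() later runs siw_qp_llp_close(), its down_write() writes into the freed object. Below is a minimal userspace C sketch of the pattern I'd expect to close that window: pin the object with a reference taken at queue time and dropped only after the handler has run, so the destroy path can't free it underneath the worker. To be clear, this is not siw code; every name in it (qp_object, qp_get/qp_put, cm_work_handler) is made up for illustration, and I may be misreading where the reference is missing.

/*
 * Illustration only -- not siw code. Deferred work pins its target
 * object with a reference taken when the work is queued, so a
 * concurrent "destroy" cannot free the object under the handler.
 * Build with: cc -pthread sketch.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct qp_object {
	atomic_int ref;                 /* freed when this drops to zero */
	pthread_rwlock_t state_lock;    /* stand-in for the lock down_write() takes */
};

static void qp_get(struct qp_object *qp)
{
	atomic_fetch_add(&qp->ref, 1);
}

static void qp_put(struct qp_object *qp)
{
	if (atomic_fetch_sub(&qp->ref, 1) == 1) {
		pthread_rwlock_destroy(&qp->state_lock);
		free(qp);               /* last reference frees the object */
	}
}

/* Deferred "CM work": safe only because the queuer took a reference. */
static void *cm_work_handler(void *arg)
{
	struct qp_object *qp = arg;

	pthread_rwlock_wrlock(&qp->state_lock); /* like down_write() in the splat */
	/* ... drive the QP into ERROR state ... */
	pthread_rwlock_unlock(&qp->state_lock);

	qp_put(qp);                     /* drop the queue-time reference */
	return NULL;
}

int main(void)
{
	struct qp_object *qp = calloc(1, sizeof(*qp));
	pthread_t worker;

	atomic_init(&qp->ref, 1);       /* creator's reference */
	pthread_rwlock_init(&qp->state_lock, NULL);

	qp_get(qp);                     /* pin the QP for the queued work */
	pthread_create(&worker, NULL, cm_work_handler, qp);

	qp_put(qp);     /* "destroy" path: object survives until the worker's put */
	pthread_join(worker, NULL);
	return 0;
}

If siw's existing QP refcounting can be used the same way -- something like taking a reference in siw_cm_queue_work() and dropping it at the end of siw_cm_work_handler(), assuming I'm reading the driver's helpers right -- that would be the kernel-side equivalent. But I'll defer to you on whether the race really is in siw rather than further up in ib_core or rpcrdma.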