> > Hey Sean,
> >
> > Am I correct here? I.e., is it OK for the rdma application to
> > rdma_reject() and rdma_destroy_id() the CONNECT_REQUEST cm_id _inside_
> > its event handler as long as it returns 0?
> >
> > Thanks,
> >
> > Steve.

Looking at rdma_destroy_id(), I think it is invalid to call it from the event
handler:

void rdma_destroy_id(struct rdma_cm_id *id)
{
<snip>
        /*
         * Wait for any active callback to finish.  New callbacks will find
         * the id_priv state set to destroying and abort.
         */
        mutex_lock(&id_priv->handler_mutex);
        mutex_unlock(&id_priv->handler_mutex);

If the handler runs with handler_mutex already held, as it appears to, that
mutex_lock() can never succeed. And indeed, when I tried to destroy the
CONNECT_REQUEST cm_id in the nvmet event handler, the event handler thread
got stuck:

INFO: task kworker/u32:0:6275 blocked for more than 120 seconds.
      Tainted: G            E   4.7.0-rc2-nvmf-all.3+ #81
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u32:0   D ffff880f90737768     0  6275      2 0x10000080
Workqueue: iw_cm_wq cm_work_handler [iw_cm]
 ffff880f90737768 ffff880f907376d8 ffffffff81c0b500 0000000000000005
 ffff8810226a4940 ffff88102b894490 ffffffffa02cf4cd ffff880f00000000
 ffff880fcd917c00 ffff880f00000000 0000000000000004 ffff880f00000000
Call Trace:
 [<ffffffffa02cf4cd>] ? stop_ep_timer+0x2d/0xe0 [iw_cxgb4]
 [<ffffffff8163e6a7>] schedule+0x47/0xc0
 [<ffffffffa024d276>] ? iw_cm_reject+0x96/0xe0 [iw_cm]
 [<ffffffff8163e8e5>] schedule_preempt_disabled+0x15/0x20
 [<ffffffff8163fd78>] __mutex_lock_slowpath+0x108/0x310
 [<ffffffff8163ffb1>] mutex_lock+0x31/0x50
 [<ffffffffa0261498>] rdma_destroy_id+0x38/0x200 [rdma_cm]
 [<ffffffffa03145f0>] ? nvmet_rdma_queue_connect+0x1a0/0x1a0 [nvmet_rdma]
 [<ffffffffa0262fe1>] ? rdma_create_id+0x171/0x1a0 [rdma_cm]
 [<ffffffffa03146f8>] nvmet_rdma_cm_handler+0x108/0x168 [nvmet_rdma]
 [<ffffffffa026407a>] iw_conn_req_handler+0x1ca/0x240 [rdma_cm]
 [<ffffffffa024efc6>] cm_conn_req_handler+0x606/0x680 [iw_cm]
 [<ffffffffa024f109>] process_event+0xc9/0xf0 [iw_cm]
 [<ffffffffa024f277>] cm_work_handler+0x147/0x1c0 [iw_cm]
 [<ffffffff8107d4f6>] ? trace_event_raw_event_workqueue_execute_start+0x66/0xa0
 [<ffffffff81081736>] process_one_work+0x1c6/0x550
...

So I withdraw my comment about nvmet. I think the code is fine as-is. The
second reject is a no-op, since the connection request was already rejected
by nvmet.

Steve.
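
For anyone who wants to see the locking pattern without an RDMA setup, here
is a rough userspace sketch of what the trace seems to show. It is not the
rdma_cm code: every demo_* name and both handlers are made up for
illustration, and it assumes the CM core calls the event handler with
handler_mutex held. An error-checking pthread mutex stands in for the kernel
mutex so that the relock reports EDEADLK instead of hanging the way the
kworker did.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the private cm_id; not the kernel structure. */
struct demo_cm_id {
    pthread_mutex_t handler_mutex;
    int destroyed;
};

typedef int (*demo_handler)(struct demo_cm_id *id);

/*
 * Use an error-checking mutex: relocking from the owning thread returns
 * EDEADLK. A kernel mutex would simply block forever in the same situation.
 */
int demo_id_init(struct demo_cm_id *id)
{
    pthread_mutexattr_t attr;
    int ret;

    id->destroyed = 0;
    ret = pthread_mutexattr_init(&attr);
    if (ret)
        return ret;
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (!ret)
        ret = pthread_mutex_init(&id->handler_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return ret;
}

/* Mirrors the lock/unlock pair quoted from rdma_destroy_id(). */
int demo_destroy_id(struct demo_cm_id *id)
{
    int ret = pthread_mutex_lock(&id->handler_mutex);

    if (ret)
        return ret;     /* EDEADLK when called from inside the handler */
    pthread_mutex_unlock(&id->handler_mutex);
    id->destroyed = 1;
    return 0;
}

/* Handler that tries to destroy the id inline (the problematic case). */
int handler_destroys_inline(struct demo_cm_id *id)
{
    return demo_destroy_id(id);
}

/* Handler that leaves the destroy for later, outside the callback. */
int handler_defers(struct demo_cm_id *id)
{
    (void)id;
    return 0;
}

/* Assumed delivery path: the handler runs with handler_mutex held. */
int demo_deliver_event(struct demo_cm_id *id, demo_handler handler)
{
    int ret = pthread_mutex_lock(&id->handler_mutex);

    assert(ret == 0);
    ret = handler(id);
    pthread_mutex_unlock(&id->handler_mutex);
    return ret;
}
```

After demo_id_init(), delivering an event to handler_destroys_inline should
come back with EDEADLK and leave the id undestroyed, while delivering to
handler_defers and then calling demo_destroy_id() from outside the callback
should succeed. Build with -pthread if your libc needs it.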