> There are two handlers in question here, the listener cm_id and > the connection cm_id. The connection cma_id should definitely trigger > disconnect and resource cleanup. The question is should the listener > cma_id (which maps to the isert network portal - np) recreate the > cma_id in this event. 1) How connection cm_id and isert_conn are created: [cma.c] - cma_ib_req_handler() - * Create conn_id via cma_ib_new_conn_id(listen_id) - * Calls isert_connect_request(conn_id) [ib_isert.c] - - isert_connect_request(cma_id) - - * isert_conn = kzalloc() - - * isert_conn->cm_id = cma_id 2) Acceptance of the connection request starts after creating the listener cm_id: [ib_isert.c] - isert_setup_id() [ib_isert.c] - - rdma_listen() [cma.c] - - - cma_ib_listen() [cma.c] - - - - ib_cm_insert_listen(cma_ib_req_handler) 3) Processing RDMA_CM_EVENT_ADDR_CHANGE for connection cm_id: [ib_isert.c] - rdma_disconnect(cm_id) [ib_isert.c] - rdma_destroy_id(cm_id) [ib_isert.c] - kfree(iser_conn) 4) Processing RDMA_CM_EVENT_ADDR_CHANGE for listener cm_id: Since iser_conn has been freed, it needs to be recreated. There are 2 options here: a) listener cm_id needs to be recreated, it calls rdma_listen(), which in turn initiates the acceptance of the connection request, after which iser_conn will be created. b) listener cm_id does not need to be recreated, that is, RDMA_CM_EVENT_ADDR_CHANGE is informative for it. I have tested my test case with the following patch that matches point b: diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 636d590765f9578c0ff595cdf74b79400bfa66ed..54f615828961576ffa1f74c8b9781a5cf48512a3 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -601,6 +601,8 @@ static int isert_np_cma_handler(struct isert_np *isert_np, enum rdma_cm_event_type event) { + int ret = -1; + isert_dbg("%s (%d): isert np %p\n", rdma_event_msg(event), event, isert_np); @@ -609,19 +611,14 @@ isert_np_cma_handler(struct isert_np *isert_np, isert_np->cm_id = NULL; break; case RDMA_CM_EVENT_ADDR_CHANGE: - isert_np->cm_id = isert_setup_id(isert_np); - if (IS_ERR(isert_np->cm_id)) { - isert_err("isert np %p setup id failed: %ld\n", - isert_np, PTR_ERR(isert_np->cm_id)); - isert_np->cm_id = NULL; - } + ret = 0; break; default: isert_err("isert np %p Unexpected event %d\n", isert_np, event); } - return -1; + return ret; } static int As a result, the listener cm_id remained, the connection request did not come, isert_conn was not recreated. On the initiator, the load dropped to 0 and then ended. With my patch that matches point a, the listener cm_id was recreated -> connection request was received -> isert_conn was created. Based on the above, I can conclude that the RDMA_CM_EVENT_ADDR_CHANGE event is not informative for the listener cm_id. Best regards, Chesnokov Gleb