On 3/22/21 12:22 PM, Mike Christie wrote: > On 3/10/21 12:26 AM, Gulam Mohamed wrote: >> Description >> =========== >> 1. This Kernel panic could be due to a timing issue when there is a >> race between the sync thread and the initiator was processing of >> a login response from the target. The session re-open can be invoked >> from two places: >> a. Sessions sync thread when the iscsid restart >> b. From iscsid through iscsi error handler >> 2. The session reopen sequence is as follows in user-space >> a. Disconnect the connection >> b. Then send the stop connection request to the kernel >> which releases the connection (releases the socket) >> c. Queues the reopen for 2 seconds delay >> d. Once the delay expires, create the TCP connection again by >> calling the connect() call >> e. Poll for the connection >> f. When poll is successful i.e when the TCP connection is >> established, it performs: >> i. Creation of session and connection data structures >> ii. Bind the connection to the session. This is the place >> where we assign the sock to tcp_sw_conn->sock >> iii. Sets parameters like target name, persistent address >> iv. Creates the login pdu >> v. Sends the login pdu to kernel >> vi. Returns to the main loop to process further events. >> The kernel then sends the login request over to the >> target node >> g. Once login response with success is received, it enters full >> feature phase and sets the negotiable parameters like >> max_recv_data_length,max_transmit_length, data_digest etc. >> 3. While setting the negotiable parameters by calling >> "iscsi_session_set_neg_params()", kernel panicked as sock was NULL >> >> What happened here is >> --------------------- >> 1. Before initiator received the login response mentioned in above >> point 2.f.v, another reopen request was sent from the error >> handler/sync session for the same session, as the initiator utils >> was in main loop to process further events (as mentioned in point >> 2.f.vi above). >> 2. While processing this reopen, it stopped the connection which >> released the socket and queued this connection and at this point >> of time the login response was received for the earlier on >> >> Fix >> --- >> 1. Add new connection state ISCSI_CONN_BOUND in "enum iscsi_connection >> _state" >> 2. Set the connection state value to ISCSI_CONN_DOWN upon >> iscsi_if_ep_disconnect() and iscsi_if_stop_conn() >> 3. Set the connection state to the newly created value ISCSI_CONN_BOUND >> after bind connection (transport->bind_conn()) >> 4. In iscsi_set_param, return -ENOTCONN if the connection state is not >> either ISCSI_CONN_BOUND or ISCSI_CONN_UP >> >> Signed-off-by: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx> >> --- >> drivers/scsi/scsi_transport_iscsi.c | 16 +++++++++++++--- >> include/scsi/scsi_transport_iscsi.h | 1 + >> 2 files changed, 14 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c >> index 91074fd97f64..19375f1ba4b2 100644 >> --- a/drivers/scsi/scsi_transport_iscsi.c >> +++ b/drivers/scsi/scsi_transport_iscsi.c >> @@ -2475,6 +2475,7 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag) >> */ >> mutex_lock(&conn_mutex); >> conn->transport->stop_conn(conn, flag); >> + conn->state = ISCSI_CONN_DOWN; >> mutex_unlock(&conn_mutex); >> >> } >> @@ -2899,8 +2900,13 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) >> session->recovery_tmo = value; >> break; >> default: >> - err = transport->set_param(conn, ev->u.set_param.param, >> - data, ev->u.set_param.len); >> + if ((conn->state == ISCSI_CONN_BOUND) || >> + (conn->state == ISCSI_CONN_UP)) { >> + err = transport->set_param(conn, ev->u.set_param.param, >> + data, ev->u.set_param.len); >> + } >> + else > > The else should go on the line above > > } else > > and I'm not 100% sure but some people prefer: > > } else { > return -ENOTCONN; > } > > because the first chunk has {}s so this balances it. > I actually prefer brackets all the time :O, so I'm good with that too.