On 3/10/21 12:26 AM, Gulam Mohamed wrote: > Description > =========== > 1. This Kernel panic could be due to a timing issue when there is a > race between the sync thread and the initiator was processing of > a login response from the target. The session re-open can be invoked > from two places: > a. Sessions sync thread when the iscsid restart > b. From iscsid through iscsi error handler > 2. The session reopen sequence is as follows in user-space > a. Disconnect the connection > b. Then send the stop connection request to the kernel > which releases the connection (releases the socket) > c. Queues the reopen for 2 seconds delay > d. Once the delay expires, create the TCP connection again by > calling the connect() call > e. Poll for the connection > f. When poll is successful i.e when the TCP connection is > established, it performs: > i. Creation of session and connection data structures > ii. Bind the connection to the session. This is the place > where we assign the sock to tcp_sw_conn->sock > iii. Sets parameters like target name, persistent address > iv. Creates the login pdu > v. Sends the login pdu to kernel > vi. Returns to the main loop to process further events. > The kernel then sends the login request over to the > target node > g. Once login response with success is received, it enters full > feature phase and sets the negotiable parameters like > max_recv_data_length,max_transmit_length, data_digest etc. > 3. While setting the negotiable parameters by calling > "iscsi_session_set_neg_params()", kernel panicked as sock was NULL > > What happened here is > --------------------- > 1. Before initiator received the login response mentioned in above > point 2.f.v, another reopen request was sent from the error > handler/sync session for the same session, as the initiator utils > was in main loop to process further events (as mentioned in point > 2.f.vi above). > 2. While processing this reopen, it stopped the connection which > released the socket and queued this connection and at this point > of time the login response was received for the earlier on > > Fix > --- > 1. Add new connection state ISCSI_CONN_BOUND in "enum iscsi_connection > _state" > 2. Set the connection state value to ISCSI_CONN_DOWN upon > iscsi_if_ep_disconnect() and iscsi_if_stop_conn() > 3. Set the connection state to the newly created value ISCSI_CONN_BOUND > after bind connection (transport->bind_conn()) > 4. In iscsi_set_param, return -ENOTCONN if the connection state is not > either ISCSI_CONN_BOUND or ISCSI_CONN_UP > > Signed-off-by: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx> > --- > drivers/scsi/scsi_transport_iscsi.c | 16 +++++++++++++--- > include/scsi/scsi_transport_iscsi.h | 1 + > 2 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c > index 91074fd97f64..19375f1ba4b2 100644 > --- a/drivers/scsi/scsi_transport_iscsi.c > +++ b/drivers/scsi/scsi_transport_iscsi.c > @@ -2475,6 +2475,7 @@ static void iscsi_if_stop_conn(struct iscsi_cls_conn *conn, int flag) > */ > mutex_lock(&conn_mutex); > conn->transport->stop_conn(conn, flag); > + conn->state = ISCSI_CONN_DOWN; > mutex_unlock(&conn_mutex); > > } > @@ -2899,8 +2900,13 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) > session->recovery_tmo = value; > break; > default: > - err = transport->set_param(conn, ev->u.set_param.param, > - data, ev->u.set_param.len); > + if ((conn->state == ISCSI_CONN_BOUND) || > + (conn->state == ISCSI_CONN_UP)) { > + err = transport->set_param(conn, ev->u.set_param.param, > + data, ev->u.set_param.len); > + } > + else The else should go on the line above } else and I'm not 100% sure but some people prefer: } else { return -ENOTCONN; } because the first chunk has {}s so this balances it.