Re: [PATCH v2 09/13] scsi: target: iscsit: Fix isert disconnect handling during login

Dmitry Bogdanov <d.bogdanov@xxxxxxxxx> · Fri, 13 Jan 2023 17:08:16 +0300

On Wed, Jan 11, 2023 at 09:08:28PM -0600, Mike Christie wrote:
> 
> If we get a disconnect event while logging in we can end up in a state
> where will never be able to relogin. This happens when:
> 
> 1. login thread has put us into TARG_CONN_STATE_IN_LOGIN
> 2. isert then does
> 
> isert_disconnected_handler -> iscsit_cause_connection_reinstatement
> 
> which sets the conn connection_reinstatement flag. Nothing else happens
> because we are only in IN_LOGIN. The tx/rx threads are not running yet
> so we can't start recovery from those contexts at this time.
> 
> 3. The login thread finishes processing the login pdu and thinks login is
> done. It sets us into TARG_CONN_STATE_LOGGED_IN/TARG_SESS_STATE_LOGGED_IN.
> This starts the rx/tx threads.
> 
> 4. The initiator thought it disconnected the connection at 2, and has
> since sent a new connect which is now handled. This leads us to eventually
> run:
> 
> iscsi_check_for_session_reinstatement-> iscsit_stop_session ->
> iscsit_cause_connection_reinstatement
> 
> iscsit_stop_session sees the old conn and does
> iscsit_cause_connection_reinstatement which sees connection_reinstatement
> is set so it just returns instead of trying to kill the tx/rx threads
> which would have caused recovery to start.
> 
> 5. iscsit_stop_session then waits on session_wait_comp which will never
> happen since we didn't kill the tx/rx threads.
> 
> This has the iscsit login code check if a fabric driver ran
> iscsit_cause_connection_reinstatement during the login process similar
> to how we check for the sk state for tcp based iscsit. This will prevent
> us from getting into 3 and creating a ghost session.
> 
> Signed-off-by: Mike Christie <michael.christie@xxxxxxxxxx>
> ---
>  drivers/target/iscsi/iscsi_target_nego.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/target/iscsi/iscsi_target_nego.c b/drivers/target/iscsi/iscsi_target_nego.c
> index ff49c8f3fe24..2dd81752d4c9 100644
> --- a/drivers/target/iscsi/iscsi_target_nego.c
> +++ b/drivers/target/iscsi/iscsi_target_nego.c
> @@ -350,6 +350,16 @@ static int iscsi_target_do_tx_login_io(struct iscsit_conn *conn, struct iscsi_lo
>                                             ISCSI_LOGIN_STATUS_NO_RESOURCES);
>                         return -1;
>                 }
> +
> +               /*
> +                * isert doesn't know the iscsit state and uses
> +                * iscsit_cause_connection_reinstatement as a generic error
> +                * notification system. It may call it before we are in FFP.
> +                * Handle this now in case it signaled a failure before the
> +                * rx/tx threads were up and could start recovery.
> +                */
> +               if (atomic_read(&conn->connection_reinstatement))
> +                       goto err;

Why only for login->login_complete case? In other case the session will
not hang? Will it be droppped on login timeout or something else?

May be the root cause is point 2 itself - calling iscsit_cause_connection_reinstatement
in not ISER_CONN_FULL_FEATURE state where there are no TX_RX threads?
I mean that was a misuse of iscsit_cause_connection_reinstatement?

>         }
> 
>         if (conn->conn_transport->iscsit_put_login_tx(conn, login,
> --
> 2.31.1
> 
>