Re: iSCSI Abort Task and WRITE PENDING

Konstantin Shelekhin <k.shelekhin@xxxxxxxxx> · Fri, 15 Oct 2021 02:12:34 +0300

On Wed, Oct 13, 2021 at 03:21:19PM -0500, Mike Christie wrote:
> For lun resets, let's take the abort case.
> 
> 1. core_tmr_abort_task is waiting in target_put_cmd_and_wait for the WRITE.
> 2. For the normal case where a cmd has been passed to LIO core then eventually
> the cmd completes and we do:
> 
> target_complete_cmd -> target_complete_cmd_with_sense -> target_cmd_interrupted
> -> target_handle_abort. Ignoring TAS, that function then does the last puts on the
> cmd which wakes up the TMR code in #1.
> 
> For the case where we never submit the cmd to LIO core, then the puts will never
> be done by that target_complete_cmd path. And, in this case where the iscsi connection
> is closed we know the cmd will never be submitted to LIO core because the iscsi
> threads have been killed. iscsi needs to do the last puts on the cmd so #1 wakes up
> on that WRITE.
> 
> So we need a cmd bit to set to indicate iscsi has called one of the submission
> /execution functions in LIO. If it's not set then iscsit_release_commands_from_conn/
> iscsit_free_cmd needs to do those last puts.
> 
> Here in this example to better show what I mean, we detect if an abort has been sent
> to LIO core for a cmd that has not been sent to LIO core. If that happens then
> iscsit_free_cmd gets force_cleanup=true so transport_generic_free_cmd will do a put
> and not wait, and then iscsit_free_cmd does the last target_put_sess_cmd which
> will wake #1 above. The abort will then complete too.
> 
> Note:
> I don't think we have to worry about TAS in this case, because we never got a complete
> SCSI command. The iscsi layer only got part of it and never submitted it to the SCSI
> layer. We then escalated to session level recovery so we are not sending any responses
> for any of the cmds or TMFs. The transport is killed at this point, so no responses
> can even be sent.
> 
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 2c54c5d8412d..d0e80a2b653b 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -4088,7 +4088,8 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>  
>  		if (se_cmd->se_tfo != NULL) {
>  			spin_lock_irq(&se_cmd->t_state_lock);
> -			if (se_cmd->transport_state & CMD_T_ABORTED) {
> +			if (se_cmd->transport_state & CMD_T_ABORTED &&
> +			    se_cmd->transport_state & CMD_T_SUBMITTED) {
>  				/*
>  				 * LIO's abort path owns the cleanup for this,
>  				 * so put it back on the list and let
> @@ -4096,7 +4097,7 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>  				 */
>  				list_move_tail(&cmd->i_conn_node,
>  					       &conn->conn_cmd_list);
> -			} else {
> +			} else if (!(se_cmd->transport_state & CMD_T_ABORTED)) {
>  				se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>  			}
>  			spin_unlock_irq(&se_cmd->t_state_lock);
> @@ -4108,8 +4109,12 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>  		list_del_init(&cmd->i_conn_node);
>  
>  		iscsit_increment_maxcmdsn(cmd, sess);
> -		iscsit_free_cmd(cmd, true);
>  
> +		if (se_cmd->transport_state & CMD_T_ABORTED &&
> +		    !(se_cmd->transport_state & CMD_T_SUBMITTED))
> +			iscsit_free_cmd(cmd, false, true);
> +		else
> +			iscsit_free_cmd(cmd, true, false);
>  	}
>  }
>  
> diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
> index 6dd5810e2af1..931586595044 100644
> --- a/drivers/target/iscsi/iscsi_target_util.c
> +++ b/drivers/target/iscsi/iscsi_target_util.c
> @@ -742,7 +742,7 @@ void __iscsit_free_cmd(struct iscsi_cmd *cmd, bool check_queues)
>  		conn->conn_transport->iscsit_unmap_cmd(conn, cmd);
>  }
>  
> -void iscsit_free_cmd(struct iscsi_cmd *cmd, bool shutdown)
> +void iscsit_free_cmd(struct iscsi_cmd *cmd, bool shutdown, bool force_cleanup)
>  {
>  	struct se_cmd *se_cmd = cmd->se_cmd.se_tfo ? &cmd->se_cmd : NULL;
>  	int rc;
> @@ -751,10 +751,14 @@ void iscsit_free_cmd(struct iscsi_cmd *cmd, bool shutdown)
>  
>  	__iscsit_free_cmd(cmd, shutdown);
>  	if (se_cmd) {
> -		rc = transport_generic_free_cmd(se_cmd, shutdown);
> +		rc = transport_generic_free_cmd(se_cmd,
> +					force_cleanup ? false : shutdown);
>  		if (!rc && shutdown && se_cmd->se_sess) {
>  			__iscsit_free_cmd(cmd, shutdown);
>  			target_put_sess_cmd(se_cmd);
> +		} else if (se_cmd->sess && force_cleanup) {
> +			__iscsit_free_cmd(cmd, true);
> +			target_put_sess_cmd(se_cmd);
>  		}
>  	} else {
>  		iscsit_release_cmd(cmd);
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 14c6f2bb1b01..eb233ea8db65 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -1554,7 +1554,7 @@ int transport_handle_cdb_direct(
>  	 * this to be called for initial descriptor submission.
>  	 */
>  	cmd->t_state = TRANSPORT_NEW_CMD;
> -	cmd->transport_state |= CMD_T_ACTIVE;
> +	cmd->transport_state |= (CMD_T_ACTIVE | CMD_T_SUBMITTED);
>  
>  	/*
>  	 * transport_generic_new_cmd() is already handling QUEUE_FULL,
> @@ -2221,7 +2221,7 @@ void target_execute_cmd(struct se_cmd *cmd)
>  		return;
>  
>  	spin_lock_irq(&cmd->t_state_lock);
> -	cmd->t_state = TRANSPORT_PROCESSING;
> +	cmd->t_state = (TRANSPORT_PROCESSING | CMD_T_SUBMITTED);
>  	cmd->transport_state |= CMD_T_ACTIVE | CMD_T_SENT;
>  	spin_unlock_irq(&cmd->t_state_lock);
>  
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index fb11c7693b25..b759ec810fa9 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -511,6 +511,7 @@ struct se_cmd {
>  #define CMD_T_COMPLETE		(1 << 2)
>  #define CMD_T_SENT		(1 << 4)
>  #define CMD_T_STOP		(1 << 5)
> +#define CMD_T_SUBMITTED		(1 << 6)
>  #define CMD_T_TAS		(1 << 10)
>  #define CMD_T_FABRIC_STOP	(1 << 11)
>  	spinlock_t		t_state_lock;
> 
> 
> 
> 
> 

If I understand this aproach correctly, it fixes the deadlock, but the
connection reinstatement will still happen, because WRITE_10 won't be
aborted and the connection will go down after the timeout.

IMO it's not ideal either, since now iSCSI will have a 50% chance to
have the connection (meaning SCSI session) killed on arbitrary ABORT
TASK. While I'm sure most initiators will be able to recover from this
event, such drastic measures will certanly cause a lot of confusion for
people who are not familiar with TCM internals.