Re: [RFC] Prevent infinite retries due to DID_RESET return status

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 11, 2006 at 03:42:34PM -0600, Michael Reed wrote:
> Due to a firmware mismatch between a host and target (names withheld to
> protect the innocent?), the LLDD was returning DID_RESET for every
> i/o command.  This patch modifies the scsi layer to take into account
> when the command which received DID_RESET was issued and eventually
> give up on it instead of unconditionally reissuing it forever
> when it receives a DID_RESET.  With this patch, on my test system,
> the command receiving the constant DID_RESET times out after about
> 360 seconds.
> 
> The premise for this patch is that no command should have an infinite
> lifetime.  The impetus for this patch was a system which would not
> reach a command prompt without disconnecting the storage from the
> host.
> 
> The significant change in this patch is to call scsi_retry_command()
> instead of scsi_requeue_command() if the command which receives a
> DID_RESET did not complete any i/o (good_bytes==0).  scsi_retry_command()
> does not release the command and regenerate it like scsi_requeue_command()
> does, hence jiffies_at_alloc reflects when the command was first issued.

Generally this patch looks good to me.  Some comments:


> -extern int scsi_retry_command(struct scsi_cmnd *cmd);
> +extern int scsi_retry_command(struct scsi_cmnd *cmd, int reason);

	I've just sent a patch to kill scsi_retry_command.  Use
	scsi_queue_insert directly instead.

> +	scsi_release_buffers(cmd);

	Can you please separate out the moving of the scsi_release_buffer
	calls into a separate patch so it can be audited better?

> @@ -961,9 +992,20 @@ void scsi_io_completion(struct scsi_cmnd
>  		/* Third party bus reset or reset for error recovery
>  		 * reasons.  Just retry the request and see what
>  		 * happens.
> +		 * If no data was transferred, just reissue this
> +		 * command.  If data was transferred, regenerate
> +		 * the command to transfer only untransferred data.
>  		 */

The whole comment should look more like:

		/*
		 * Third party bus reset or reset for error recovery reasons.
		 * If no data was transferred, just reissue this command.
		 * If data was transferred, regenerate the command to transfer
		 * only untransferred data.
		 */

> -		scsi_requeue_command(q, cmd);
> -		return;
> +		if (!good_bytes) {
> +			if (!(scsi_command_expired(cmd))) {
> +				scsi_retry_command(cmd, SCSI_MLQUEUE_DID_RESET);
> +				return;
> +			}
> +		}
> +		else {
> +			scsi_requeue_command(q, cmd);
> +			return;
> +		}

	With this code we now fallthrough if we don't have any good bytes
	and the command has expired.  Is this the expected behaviour? If
	yes we need a good comment describing it.

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux