There is an error in the medium access timeout feature of the sd driver.
sd_done() resets sdkp->medium_access_timed_out to zero in the wrong
place: the reset happens only if a command returns sense data.

If an I/O command times out, error handling succeeds, and the I/O
command then completes, the value is not reset unless something
generates a sense response. A later timeout, no matter how far in the
future, then increments the counter again and causes the device to be
set offline prematurely.

Reset sdkp->medium_access_timed_out before the check for sense data.

Signed-off-by: David Jeffery <djeffery@xxxxxxxxxx>
---
To reproduce using scsi_debug, use SCSI_DEBUG_OPT_TIMEOUT or
SCSI_DEBUG_OPT_MAC_TIMEOUT to force an I/O command to time out. Then
remove the opt value so the I/O succeeds on retry. Perform more I/O as
desired. Finally, repeat the process to make a new I/O command time out.
Without the patch, the device is marked offline even though many I/O
commands have succeeded between the two timed-out commands.

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 86fcf2c..2779e6b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1669,12 +1669,12 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 						   sshdr.ascq));
 	}
 #endif
+	sdkp->medium_access_timed_out = 0;
+
 	if (driver_byte(result) != DRIVER_SENSE &&
 	    (!sense_valid || sense_deferred))
 		goto out;
 
-	sdkp->medium_access_timed_out = 0;
-
 	switch (sshdr.sense_key) {
 	case HARDWARE_ERROR:
 	case MEDIUM_ERROR: