Re: [PATCH] scsi_error: do not allow IO errors with certain ILLEGAL_REQUEST sense to be retryable

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Fri, 02 Dec 2011 15:04:51 -0600

On Fri, 2011-12-02 at 15:31 -0500, Mike Snitzer wrote:
> Thin provisioned LUNs from multiple array vendors have failed WRITE SAME
> (16) w/ UNMAP bit set with ILLEGAL_REQUEST sense, with additional sense
> codes 0x24 and 0x26 respectively.
> 
> In both instances the target would always fail the CDB no matter how
> many retries were performed (permanent target failure rather than
> transient path failure).  This resulted in mkfs.ext4's discard of a
> multipath device looping indefinitely while failing paths.

I don't quite understand this analysis.  ILLEGAL_REQUEST currently
always returns SUCCESS from scsi_check_sense().  That return is
propagated up to scsi_decide_disposition() which causes I/O completion.
We do have another gate for ILLEGAL_REQUEST in scsi_io_completion()
which can retry, but only if it's downshifting the command from a
10-byte to a 6-byte CDB (READ_10/WRITE_10 to READ_6/WRITE_6), so I
don't see where you think the looping is coming from.  The net effect
of your patch is to change the error passed on to the block layer in
blk_end_request() from -EIO to -EREMOTEIO.  So it sounds like if there
is a retry problem, it's above SCSI?
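
To make the path I'm describing concrete, here is a rough userspace
model of it.  This is a sketch, not the kernel source: the struct, the
enums, the model_* function names and the "patched" flag are all
hypothetical stand-ins, and everything other than the ILLEGAL_REQUEST
path is left out.  Only the sense key value (0x05) and the errno names
are real.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EREMOTEIO
#define EREMOTEIO 121		/* Linux value; fallback for other libcs */
#endif

#define ILLEGAL_REQUEST 0x05	/* SCSI sense key */

/* Hypothetical stand-ins for the midlayer's decisions. */
enum model_disposition { DISP_SUCCESS, DISP_OTHER };
enum model_action { ACT_FAIL, ACT_REPREP };

/* Hypothetical, heavily reduced view of a failed command. */
struct model_cmd {
	unsigned char sense_key;
	int can_downshift_10_to_6;	/* READ_10/WRITE_10 on a device that
					 * may fall back to 6-byte CDBs */
};

/* Models the scsi_check_sense() step: ILLEGAL_REQUEST yields SUCCESS,
 * which means "complete the command", not "retry it". */
static enum model_disposition model_check_sense(const struct model_cmd *cmd)
{
	assert(cmd != NULL);
	if (cmd->sense_key == ILLEGAL_REQUEST)
		return DISP_SUCCESS;
	return DISP_OTHER;	/* other sense keys are not modelled */
}

/* Models the second gate in scsi_io_completion(): the only reissue
 * for ILLEGAL_REQUEST is the 10-byte to 6-byte downshift. */
static enum model_action model_io_completion(const struct model_cmd *cmd)
{
	assert(cmd != NULL);
	if (cmd->sense_key == ILLEGAL_REQUEST && cmd->can_downshift_10_to_6)
		return ACT_REPREP;
	return ACT_FAIL;
}

/* Error handed to the block layer for a failed ILLEGAL_REQUEST command.
 * Returns 0 when nothing is reported (command reissued or unmodelled). */
static int model_block_error(const struct model_cmd *cmd, int patched)
{
	if (model_check_sense(cmd) != DISP_SUCCESS)
		return 0;
	if (model_io_completion(cmd) == ACT_REPREP)
		return 0;
	return patched ? -EREMOTEIO : -EIO;
}
```

In this model a failed WRITE SAME (sense_key = ILLEGAL_REQUEST,
can_downshift_10_to_6 = 0) is failed exactly once by SCSI with or
without the patch; only the value seen by the block layer differs
(-EIO versus -EREMOTEIO).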

James
