Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR

Mark Lord <liml@xxxxxx> · Wed, 31 Jan 2007 13:37:56 -0500

James Bottomley wrote:
On Wed, 2007-01-31 at 12:57 -0500, Mark Lord wrote:
Alan wrote:
When libata reports a MEDIUM_ERROR to us, we *know* it's non-recoverable,
as the drive itself has already done internal retries (libata uses the
"with retry" ATA opcodes for this).
This depends on the firmware. Some of the "raid firmware" drives don't
appear to do retries in firmware.
One way to tell if this is true, is simply to time how long
the failed operation takes.  If the drive truly does not do retries,
then the media error should be reported more or less instantly
(assuming drive was already spun up).

Well, the simpler way (and one we have a hope of implementing) is to
examine the ASC/ASCQ codes to see if the error is genuinely unretryable.

My suggestion above was not for a kernel fix,
but rather just as a way of determining if drives
which claim "no retries" actually do them or not.  :)

I seem to have dropped the ball on this one in that the scsi_error.c
pieces of this patch

http://marc.theaimsgroup.com/?l=linux-scsi&m=116485834119885

I thought I'd applied.  Apparently I didn't, so I'll go back and put
them in.

Good.  That would be a useful supplement to the patch I posted here.

Cheers
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html