On 10/28/12 21:34, Chris Murphy wrote:
> Anyway, the idea mdadm users can't benefit from shorter ERC is untrue.
> They certainly can. But the open question is why would they be getting
> such long error recovery times in the first place? 7 seconds is a long
> time. 2 minutes is a WTF moment.
Totally agreed that 7 seconds is already far too long.
Suppose the user is reading a 5MB file from a 5-disk array: that's
(approximately) 1MB from each disk.
Suppose that on one disk you hit a bad area where all sectors are
unreadable: at 4KB per sector that's a run of 256 sectors, each with a
7-second wait, i.e. 256 x 7 = 1792 seconds. That means a HALF AN HOUR
wait!
That's total nonsense.
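
To make the arithmetic above easy to replay with other numbers, here is a
minimal Python sketch. The function name stall_seconds and the 4096-byte
sector size are my own assumptions for illustration (256 sectors per 1MB
implies 4KB sectors); it also assumes the worst case where every sector in
the range runs into the full ERC limit.

```python
def stall_seconds(bytes_per_disk, sector_bytes, erc_seconds):
    """Worst-case wait if every sector in the range hits the ERC limit."""
    sectors = bytes_per_disk // sector_bytes
    return sectors * erc_seconds


MIB = 1024 * 1024

# 1MB on the bad disk, 4KB sectors (assumed), 7-second ERC
wait = stall_seconds(MIB, 4096, 7.0)
print(f"{wait:.0f} s = {wait / 60:.1f} min")  # prints "1792 s = 29.9 min"
```

With 512-byte sectors the same 1MB would be 2048 sectors, so the wait
would be eight times longer.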
I have tried to set ERC to values lower than 7 seconds on Hitachi drives,
and AFAIR also on WD RE, but neither of the two accepted values lower
than, IIRC, 6.0 seconds. Which is strange, because the smart command
wants 2 digits, expressed in deciseconds, so I would expect to be able
to set the ERC as low as 0.1 seconds, which is definitely not possible.
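
For reference, a small Python sketch of how the seconds-to-deciseconds
conversion maps onto the smartctl "-l scterc,READ,WRITE" option. The
helper name scterc_args and the /dev/sdX device are placeholders of mine,
and the sketch only builds the argument list without running anything.
Whether a drive accepts a given value is up to its firmware, as described
above.

```python
def scterc_args(read_seconds, write_seconds, device):
    """Build a smartctl argument list for setting SCT ERC.

    smartctl takes the read and write limits in tenths of a second,
    so 7.0 seconds becomes 70.
    """
    read_ds = round(read_seconds * 10)
    write_ds = round(write_seconds * 10)
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


# 7.0 seconds for both reads and writes on a placeholder device
print(" ".join(scterc_args(7.0, 7.0, "/dev/sdX")))
```

The resulting list could be handed to subprocess.run() as root; I have
left that out so the sketch has no side effects.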
Any comments would be appreciated.
What Linux needs in order to address this, imho, is the ability to
configure the failure actions. If the drive does not respond within 1
second, I want the SCSI command to be aborted, then a device RESET, bus
RESET or whatever (without the drive dropping out of the controller if
possible), then an error to be returned to MD so that it starts the
sector rewrite and moves on immediately. Do you think this would be
possible, or would it confuse the drive?
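
The closest existing knob I am aware of is the per-device SCSI command
timeout in sysfs, /sys/block/<dev>/device/timeout, expressed in whole
seconds. The Python sketch below only reads and writes that file; the
helper names and the sysfs_root parameter are my own, added so it can be
tried against a scratch directory. It does not configure which recovery
actions the kernel takes after the timeout, and I have not verified how a
drive behaves with a timeout shorter than its ERC.

```python
from pathlib import Path


def timeout_path(device, sysfs_root="/sys"):
    """Path of the SCSI command timeout file for a block device like 'sda'."""
    return Path(sysfs_root) / "block" / device / "device" / "timeout"


def read_timeout(device, sysfs_root="/sys"):
    """Return the current command timeout in seconds."""
    return int(timeout_path(device, sysfs_root).read_text().strip())


def write_timeout(device, seconds, sysfs_root="/sys"):
    """Set the command timeout in whole seconds (needs root on real sysfs)."""
    timeout_path(device, sysfs_root).write_text(f"{int(seconds)}\n")
```

For example, write_timeout("sda", 1) would request the 1-second limit
mentioned above; what happens once it expires is still decided by the
kernel's error handler, not by this setting.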