On Sat, 24 Dec 2011 19:24:37 -0500, Phil Turmel <philip@xxxxxxxxxx> wrote:
> On 12/24/2011 10:54 AM, Roger Heflin wrote:
> > On my Seagates I turned down the SCTERC to really low (ie .2 seconds)
> > and from what I could see it did not make an obvious difference in
> > the length of the time that the system paused, the pauses appeared to
> > stay at about 30 seconds...which I guess implies that the actual read
> > failed timeout was being hit rather than the disk returning an error
> > in a reasonable time...from the log each time it was forcing a
> > re-write it appeared to be 8 sections of 8 sector each so 32k of
> > data, 64 sectors.  I seem to remember there is a way to turn down
> > the disk op timeout...but at least on my system turning it down lower
> > would mean that the disks might not have enough time to spinup out of
> > a sleep...
>
> On the drives I've checked closely, any SCTERC setting below 6.5 seconds
> is discarded and treated as zero (no limit).  Setting timeouts in the
> driver stack below the timeout in the drive is counterproductive, as
> drives won't abandon the error recovery attempt to reply to the controller's
> next command.  So the drive gets kicked out of the array as completely
> failed (unresponsive) instead of dealing with the localized read
> error.

Well, that's fair enough, but I'm guessing that it would be relatively
cheap to notice the fact that the read took _ages_ to return, and treat
that as a failure of sorts, even if the drive eventually claims success.

Then, at least the sector would be rewritten, which would either solve
the problem by refreshing the data, or provoke the sector to be
re-mapped if the physical sector was really damaged.  That way you'd not
be constantly bumping into the same pending sectors, provoking extended
read attempts, and thus degrading the whole system's performance.
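
To make that a bit more concrete, here is a rough userspace sketch in
Python of the decision I have in mind.  It is not md code and does not
reflect anything md does today; the names (ReadVerdict, classify_read,
needs_rewrite, timed_read) and the 7 second threshold are all invented
for illustration, the threshold being picked only because it sits just
above the 6.5 second figure quoted above.

```python
import time
from enum import Enum


class ReadVerdict(Enum):
    OK = "ok"                      # fast, successful read
    SLOW_SUCCESS = "slow-success"  # data came back, but only after a long stall
    FAILED = "failed"              # the read returned an error


# Hypothetical threshold, not taken from any real driver or drive.
SLOW_READ_SECONDS = 7.0


def classify_read(succeeded, elapsed_seconds, slow_threshold=SLOW_READ_SECONDS):
    """Classify one completed read by its outcome and how long it took."""
    if not succeeded:
        return ReadVerdict.FAILED
    if elapsed_seconds >= slow_threshold:
        return ReadVerdict.SLOW_SUCCESS
    return ReadVerdict.OK


def needs_rewrite(verdict):
    """A slow success is treated like a failure: both ask for a rewrite."""
    return verdict is not ReadVerdict.OK


def timed_read(read_fn, slow_threshold=SLOW_READ_SECONDS):
    """Run read_fn(), time it, and return (data, verdict).

    read_fn is any zero-argument callable that returns the data or
    raises OSError; it stands in for the real block read.
    """
    start = time.monotonic()
    try:
        data = read_fn()
        succeeded = True
    except OSError:
        data = None
        succeeded = False
    elapsed = time.monotonic() - start
    return data, classify_read(succeeded, elapsed, slow_threshold)
```

The only point of the sketch is that the extra check is one comparison
on a duration that is cheap to measure; the rewrite itself would still
come from the redundancy elsewhere in the array.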

Alternatively, some way of nudging mdadm into rewriting a sector in one
device from wherever it's stored elsewhere in a RAID could be combined
with something looking for read failures in the logs, without needing to
add any extra checks to the normal operational code.

Cheers, Phil.
-- 
|)|  Philip Hands [+44 (0)20 8530 9560]    http://www.hands.com/
|-|  HANDS.COM Ltd.                    http://www.uk.debian.org/
|(|  10 Onslow Gardens, South Woodford, London  E18 1NE  ENGLAND