On Mon, Nov 7, 2011 at 07:57, David Brown <david@xxxxxxxxxxxxxxx> wrote:
> On 07/11/2011 14:49, Miles Fidelman wrote:
>>
>> Danilo Godec wrote:
>>>
>>> Some manufacturers make 'special' versions of drives for RAID (WD RE4,
>>> Seagate SE, ...). Apparently the main difference is in error handling,
>>> where normal 'desktop' drives try hard to recover an error (up to
>>> several minutes) while RAID drives give up quickly (few seconds) so
>>> that the RAID controller can take over.
>>>
>> not so much "special" as "different"
>>
>> the term to look for is "enterprise"
>>
>> you've identified the key distinction:
>>
>> - desktop drives assume that they have the only copy of your data, the
>> on-board processor tries very hard to read and re-read until it returns
>> your data ---- the result is that everything slows down
>>
>> - if you have a raid array, you want a failing disk to give up and
>> return, very quickly, so that the data can be read from a different drive
>>
>> I learned this the hard way, when I had a server that just slowed way
>> down to the point that it took 10 seconds or more to echo a keystroke.
>> It took me a long time to figure out what was going on - and some rather
>> painful false starts (trashed the o/s).
>>
>> One important thing I discovered: the md RAID driver does NOT consider a
>> long time delay as a signal to fail a drive out of an array. It's a
>> really good idea to run mdstat and keep an eye on your drives. If Raw
>> Read Error goes above 0, start paying attention.
>>
>
> As far as I know (and I hope I'll be corrected quickly if I'm wrong), when a
> drive fails to read from a sector, it will be considered a "failed" drive by
> the raid controller or software raid, and kicked out of the array. The
> exception is the latest versions of md raid which support bad block lists.
>

I don't think that's quite correct - when a member drive of an MD RAID
returns a read error, MD tries to re-write the sector using the
redundancy from the other drives in the RAID. It's only if a drive
returns a *write* error that the drive is failed.

-- 
Conway S. Smith
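
To make the read-error versus write-error distinction above concrete,
here is a toy Python model of a mirror. It is only a sketch of the
behaviour as described in this message, not the md driver's code:
ToyDrive and mirror_read are hypothetical names invented for the
illustration, and the model ignores bad block lists, timeouts, and
everything else the real driver does.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDrive:
    """Hypothetical in-memory stand-in for one mirror member."""
    name: str
    sectors: dict                                   # sector number -> bytes
    unreadable: set = field(default_factory=set)    # sectors that return a read error
    unwritable: set = field(default_factory=set)    # sectors that return a write error
    failed: bool = False                            # True once kicked out of the array

    def read(self, sector):
        if sector in self.unreadable:
            raise IOError(f"{self.name}: read error at sector {sector}")
        return self.sectors[sector]

    def write(self, sector, data):
        if sector in self.unwritable:
            raise IOError(f"{self.name}: write error at sector {sector}")
        self.sectors[sector] = data
        # Assumption of this toy: a successful rewrite makes the sector
        # readable again (a real drive would typically remap it).
        self.unreadable.discard(sector)


def mirror_read(drives, sector):
    """Read a sector from a toy mirror, repairing members that hit read errors."""
    had_read_error = []
    for drive in drives:
        if drive.failed:
            continue
        try:
            data = drive.read(sector)
        except IOError:
            # A read error alone does not fail the drive; remember it for repair.
            had_read_error.append(drive)
            continue
        # Good data found: rewrite it onto every member that could not read it.
        for bad in had_read_error:
            try:
                bad.write(sector, data)
            except IOError:
                # Only a *write* error gets the drive failed.
                bad.failed = True
        return data
    raise IOError(f"no member could supply sector {sector}")
```

In this model a drive with an unreadable but writable sector stays in
the array and ends up holding the good data again, while a drive that
also rejects the corrective write is marked failed.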