Re: Special drives for Linux Raid?

Beolach <beolach@xxxxxxxxx> · Mon, 7 Nov 2011 11:00:55 -0700

On Mon, Nov 7, 2011 at 07:57, David Brown <david@xxxxxxxxxxxxxxx> wrote:
> On 07/11/2011 14:49, Miles Fidelman wrote:
>>
>> Danilo Godec wrote:
>>>
>>> Some manufacturers make 'special' versions of drives for RAID (WD RE4,
>>> Seagate SE, ...). Apparently the main difference is in error handling,
>>> where normal 'desktop' drives try hard to recover an error (up to
>>> several minutes) while RAID drives give up quickly (few seconds) so
>>> that the RAID controller can take over.
>>>
>> not so much "special" as "different"
>>
>> the term to look for is "enterprise"
>>
>> you've identified the key distinction:
>>
>> - desktop drives assume that they have the only copy of your data, the
>> on-board processor tries very hard to read and re-read until it returns
>> your data ---- the result is that everything slows down
>>
>> - if you have a raid array, you want a failing disk to give up and
>> return, very quickly, so that the data can be read from a different drive
>>
>> I learned this the hard way, when I had a server that just slowed way
>> down to the point that it took 10 seconds or more to echo a keystroke.
>> It took me a long time to figure out what was going on - and some rather
>> painful false starts (trashed the o/s).
>>
>> One important thing I discovered: the md RAID driver does NOT consider a
>> long time delay as a signal to fail a drive out of an array. It's a
>> really good idea to run mdstat and keep an eye on your drives. If Raw
>> Read Error goes above 0, start paying attention.
>>
>
> As far as I know (and I hope I'll be corrected quickly if I'm wrong), when a
> drive fails to read from a sector, it will be considered a "failed" drive by
> the raid controller or software raid, and kicked out of the array.  The
> exception is the latest versions of md raid which support bad block lists.
>

I don't think that's quite correct - when a member drive of an MD RAID
returns a read error, MD reconstructs the data from the redundancy on
the other drives and tries to re-write the sector on the drive that
reported the error.  It's only if a drive returns a *write* error that
the drive is failed.
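
To make that policy concrete, here is a rough toy model in Python.  It
is only a sketch of the behaviour as I described it above, not the
actual md code: the Member class, handle_read() and the result strings
are all made up for illustration, and the "no good copy" case is my own
assumption since I didn't cover it above.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical stand-in for one member drive of an array."""
    name: str
    bad_read: set = field(default_factory=set)   # sectors that return a read error
    bad_write: set = field(default_factory=set)  # sectors that return a write error
    failed: bool = False

    def read(self, sector):
        # True means the read succeeded.
        return sector not in self.bad_read

    def write(self, sector):
        # True means the write succeeded.
        if sector in self.bad_write:
            return False
        # Toy assumption: a successful re-write makes the sector readable again.
        self.bad_read.discard(sector)
        return True


def handle_read(array, member, sector):
    """Describe what the toy array does for one read from one member."""
    if member.read(sector):
        return "ok"
    # Read error: look for another healthy member that can supply the data.
    peers = [m for m in array
             if m is not member and not m.failed and m.read(sector)]
    if not peers:
        # Assumption, not covered in the post: nothing to rebuild from.
        return "unrecoverable: no good copy"
    # Re-write the sector on the member that reported the read error.
    if member.write(sector):
        return "corrected"
    # Only a *write* error fails the drive out of the array.
    member.failed = True
    return "failed: write error"
```

With two members where sda has an unreadable sector 7,
handle_read([sda, sdb], sda, 7) gives "corrected" and sda stays in the
array; if sector 7 also rejects writes, the result is "failed: write
error" and sda.failed becomes True.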

-- 
Conway S. Smith