John Robinson wrote:
On 02/02/2010 22:46, Greg Freemyer wrote:
All,
I think the below is accurate, but please correct me if I'm wrong or have misunderstood.
===
If you're using normal big drives (1TB, etc.) in a raid-5 array, the
general consensus of this list is that it is a bad idea. The reason
being that the per-sector error rate has not changed with increasing
density.
So in the days of 1GB drives, the likelihood of an undetected /
unrepaired bad sector was actually pretty low for the drive as a whole.
But for today's 1TB drives, the odds are 1000x worse, i.e. 1000x more
sectors with the same basic failure rate per sector.
So a raid-5 composed of 1TB drives is 1000x more likely to be unable
to rebuild itself after a drive failure than a raid-5 built from 1 GB
drives of yesteryear. Thus the current recommendation is to use raid-6
with high-density drives.
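
A rough sketch of that scaling, in Python (my own arithmetic, not from
the original post; the 1-in-10^15 per-bit unrecoverable-read-error rate
is an assumption, taken from the drive specs quoted further down):

  # Sketch: probability of at least one unrecoverable read error (URE)
  # when reading a whole drive end to end, assuming a 1-in-10^15
  # per-bit URE rate (an assumed figure; see the WD spec quoted below).
  P_BIT = 1e-15

  def p_whole_drive_ure(drive_bytes):
      bits = drive_bytes * 8
      return 1 - (1 - P_BIT) ** bits   # ~ bits * P_BIT when small

  print(p_whole_drive_ure(1e9))    # 1 GB drive: ~8e-6
  print(p_whole_drive_ure(1e12))   # 1 TB drive: ~8e-3, i.e. ~1000x worse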
That sounds about right. One might still see RAID-5 as a way of
pushing the risk of data loss from bad sectors back into a comfortable
zone. After all, the likelihood of the same sector going bad on one of
the other drives should be relatively small. Unfortunately it's too
long since I studied probability for me to work it out properly (a
rough sketch of that sum follows below). Then, to also protect
yourself against dead drives, adding another drive a la RAID-6 sounds
like the answer. But you can't think of RAID-6 as protecting you from
2 drive failures any more.
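
For what it's worth, here is a rough stab at that sum (not John's
original working; it assumes a 1-in-10^15 per-bit rate and that a
rebuild has to read every surviving drive in full, with the drive sizes
and array width below only as examples):

  # Sketch: chance a RAID-5 rebuild hits at least one unrecoverable
  # sector, assuming a 1-in-10^15 per-bit URE rate and a full read of
  # every surviving drive.
  P_BIT = 1e-15

  def p_rebuild_hits_ure(drive_bytes, n_drives):
      bits_read = drive_bytes * 8 * (n_drives - 1)   # surviving drives
      return 1 - (1 - P_BIT) ** bits_read

  print(p_rebuild_hits_ure(1e12, 5))   # 5 x 1TB raid-5: ~3% per rebuild
  print(p_rebuild_hits_ure(2e12, 5))   # 5 x 2TB raid-5: ~6% per rebuild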
What is more, you need Linux md's implementation of single-sector
recovery/rewriting for this to work. You cannot go around failing
arrays because occasional single-sector reads fail.
The good news is that Western Digital is apparently introducing a new
series of drives with an error rate "2 orders of magnitude" better
than the current generation.
It's not borne out in their figures; WD quote "less than 1 in 10^15
bits" which is the same as they quote for their older drives.
What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable
error rate, suggest you've a 1 in 63 chance of getting an
uncorrectable error while reading the whole surface of their 2TB disc.
Read the whole disc 44 times and you've a 50/50 chance of hitting an
uncorrectable error.
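
Those numbers check out if you redo the sum (assuming the quoted figure
really is per bit and the whole 2TB surface is read):

  import math

  # Sketch: redo the sum above, assuming a 1-in-10^15 per-bit URE rate
  # and one full read of a 2TB (2e12 byte) disc.
  P_BIT = 1e-15
  bits = 2e12 * 8                      # one full pass over the disc

  p_pass = 1 - (1 - P_BIT) ** bits
  print(1 / p_pass)                    # ~63, i.e. a 1 in 63 chance

  # Full passes needed for a 50/50 chance of at least one URE.
  n = math.log(0.5) / math.log(1 - p_pass)
  print(math.ceil(n))                  # ~44 reads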
Rethink that: virtually all errors happen during write; reading is
non-destructive in terms of what's on the drive. So the data is either
valid after the write or it isn't, and once it has been written
correctly, barring failures in the media (including mechanical parts)
or electronics, the chances of it "going bad" later are probably
vanishingly small. And since "write in the wrong place" errors are
proportional to actual writes, long-term storage of unchanging data
fares better than active drives with lots of change.
You could read the whole drive in about 5 hours, according to the spec
(at 110MB/s), so if you keep your drive busy you're going to reach
that 50/50 point in about 9 days. If you had a 5-drive array, you'd
get there inside 2 days.
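
The timing works out the same way (assuming 110MB/s sustained and the
~44 full reads from the sum above):

  # Sketch: time until a constantly-busy drive reaches the 50/50 point,
  # assuming 110 MB/s sustained reads and ~44 full passes (from above).
  DRIVE_BYTES = 2e12
  RATE_BPS = 110e6                          # assumed sustained read rate
  PASSES = 44

  hours_per_pass = DRIVE_BYTES / RATE_BPS / 3600
  print(hours_per_pass)                     # ~5 hours per full read

  print(PASSES * hours_per_pass / 24)       # ~9 days for one busy drive
  print(PASSES * hours_per_pass / 24 / 5)   # <2 days for a 5-drive array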
Bear in mind that this is on a drive working perfectly correctly as
specified. We have to expect to be recovering from failed reads daily.
</doom> ;-)
Cheers,
John.
PS. Wish I'd written down my working for this.
PPS. I'm not having a go at WD; other manufacturers' specs are similar.
--
Bill Davidsen <davidsen@xxxxxxx>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein