Re: If you're using large SATA drives in RAID 5/6 ....

John Robinson wrote:
On 02/02/2010 22:46, Greg Freemyer wrote:
All,

I think the below is accurate, but please correct me if I'm wrong or have misunderstood.

===
If you're using normal big drives (1TB, etc.) in a RAID-5 array, the
general consensus of this list is that it is a bad idea. The reason is
that the per-sector error rate has not improved even as drive density
has increased.

So in the days of 1GB drives, the likelihood of an undetected/unrepaired
bad sector was actually pretty low for the drive as a whole.  But for
today's 1TB drives, the odds are 1000x worse, i.e. 1000x more sectors
with the same basic failure rate per sector.

So a RAID-5 composed of 1TB drives is 1000x more likely to be unable
to rebuild itself after a drive failure than a RAID-5 built from the
1GB drives of yesteryear.  Thus the current recommendation is to use
RAID-6 with high-density drives.

That sounds about right. One might still see RAID-5 as a way of pushing data loss through bad sectors back into a comfortable zone; after all, the likelihood of the same sector going bad on one of the other drives should be relatively small, though it's too long since I studied probability for me to work it out properly. Then, to also protect yourself against dead drives, adding another drive a la RAID-6 sounds like the answer. But you can't think of RAID-6 as protecting you from two whole-drive failures any more.
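For what it's worth, here's a rough back-of-the-envelope sketch of that probability (my own illustration, assuming the commonly quoted 1-in-10^15 bits unrecoverable read error rate and independent errors, which is a simplification rather than a real drive model):

  import math

  # Chance of at least one unrecoverable read error (URE) while rebuilding
  # a degraded RAID-5, i.e. while reading every sector of the surviving
  # drives.  Assumes independent errors at the quoted per-bit rate.
  def p_ure_during_rebuild(drive_bytes, n_drives, per_bit_rate):
      bits_read = drive_bytes * 8 * (n_drives - 1)  # surviving drives, read in full
      # 1 - (1 - p)^bits, via log1p/expm1 to avoid rounding error near 1.0
      return -math.expm1(bits_read * math.log1p(-per_bit_rate))

  TB, GB = 10**12, 10**9
  print(p_ure_during_rebuild(1 * TB, 5, 1e-15))   # ~0.031, about 1 in 32
  print(p_ure_during_rebuild(1 * GB, 5, 1e-15))   # ~3.2e-5, roughly 1000x smaller

So with five 1TB drives a rebuild-killing read error is no longer a freak event, and the ratio between the two cases is exactly the 1000x quoted above.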

What is more, you need Linux md's implementation of single-sector recovery/rewriting for this to work. You cannot go around failing arrays because occasional single-sector reads fail.
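For reference, that recovery path is what a periodic scrub exercises; a minimal sketch, assuming an array at /dev/md0 and using md's sysfs interface:

  # Ask md to read every sector of the array ("check").  When a read fails,
  # md reconstructs the block from the remaining drives and rewrites it --
  # the single-sector recovery discussed above.
  # Shell equivalent: echo check > /sys/block/md0/md/sync_action
  with open("/sys/block/md0/md/sync_action", "w") as f:
      f.write("check\n")

  # Once the scrub completes, mismatch_cnt reports how many sectors
  # disagreed during the last check/repair.
  with open("/sys/block/md0/md/mismatch_cnt") as f:
      print("mismatches:", f.read().strip())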

The good news is that Western Digital is apparently introducing a new
series of drives with an error rate "2 orders of magnitude" better
than the current generation.

It's not borne out in their figures, though; WD quote "less than 1 in 10^15 bits", which is the same rate they quote for their older drives.

What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable error rate, suggest you've a 1 in 63 chance of getting an uncorrectable error while reading the whole surface of their 2TB disc. Read the whole disc 44 times and you've a 50/50 chance of hitting an uncorrectable error.
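The arithmetic behind those two figures, for anyone who wants to re-run it (a quick sketch, assuming a 2TB drive and the quoted 1-in-10^15 bits rate):

  import math

  BITS = 2 * 10**12 * 8      # 2TB drive, in bits
  P    = 1e-15               # quoted unrecoverable error rate, per bit read

  # Chance of at least one uncorrectable error in one full-surface read
  one_pass = -math.expm1(BITS * math.log1p(-P))
  print(1 / one_pass)        # ~63, i.e. the "1 in 63" above

  # Full reads needed for a 50/50 chance of hitting at least one error
  print(math.log(0.5) / (BITS * math.log1p(-P)))   # ~43.3, i.e. about 44 reads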

Rethink that: virtually all errors happen during write; reading is non-destructive as far as what's on the drive is concerned. So the data is either valid after the write or it isn't, and once it has been written correctly, the chances of it "going bad" are probably vanishingly small, barring failures in the media (including the mechanical parts) or the electronics. And since "write in the wrong place" errors are proportional to actual writes, long-term storage of unchanging data fares better than an active drive with lots of change.

You could read the whole drive in about 5 hours, according to the spec (at 110MB/s), so if you keep your drive busy you're going to reach this point in about 9 days. If you had a 5-drive array, you're going to get here inside 2 days.
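And the timing, on the same assumptions (2TB at the quoted 110MB/s sustained rate, drives kept busy reading flat out):

  TB, MB = 10**12, 10**6

  hours_per_pass = 2 * TB / (110 * MB) / 3600
  print(hours_per_pass)                            # ~5.05 hours per full read

  reads_for_50_50 = 43.3                           # from the sketch above
  print(hours_per_pass * reads_for_50_50 / 24)     # ~9.1 days for one drive
  print(hours_per_pass * reads_for_50_50 / 24 / 5) # <2 days for a 5-drive array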

Bear in mind that this is on a drive working perfectly correctly as specified. We have to expect to be recovering from failed reads daily.

</doom> ;-)

Cheers,

John.

PS. Wish I'd written down my working for this.
PPS. I'm not having a go at WD; other manufacturers' specs are similar.


--
Bill Davidsen <davidsen@xxxxxxx>
 "We can't solve today's problems by using the same thinking we
  used in creating them." - Einstein
