Re: Disk I/O error while rebuilding an md raid-5 array

On 09/02/2010 00:04, Greg Freemyer wrote:
>> PS, if in the end I have to build a new array, I'll probably go with a
>> raid 6 instead.
>
> Agreed, someone recently posted that for a raid-5 composed of 1TB
> drives the odds of a rebuild failure are 1 in 67 even if the remaining
> drives are within spec.  (ie. the unrecoverable bit error rate is
> slowly succumbing to the ever increasing size of drives.)

Actually, the figure was a 1 in 67 chance of hitting an unrecoverable read error while reading 2TB of data, assuming an error rate of 1 in 10^15 bits read[1], which was the worst-case spec quoted by Western Digital. Others disagreed with my analysis, and I may be wrong.

That figure had nothing to do with RAID in itself, but my suggestion followed from it: RAID-5 is now only useful for defending against unrecoverable read errors, not dead drives, and if you want to defend against dead drives as well you need RAID-6.

> You have 500GB drives, but you have 3 left to rebuild from, so that's
> 1.5 TB you're trying to read.  I'm not sure how the original calculation
> was done, so your odds of failed rebuild were either 1 in 134 or about
> 1 in 42.  Either way, not very good for something that is supposed to
> protect your data.

Actually there are 4 to read from: the original sde is still available. This is a situation where I think the hot-rebuild facility recently discussed on this list would be ideal. If you can't read the data from the drive you're hot-replacing, you get a second chance to read it from the rest of the drives using the parity information, and the odds of an unrecoverable read error at the same LBA on two drives are smaller. I can't remember enough of the probability course I did years ago to work out exactly what they are, though.
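
The two-drive case can be roughed out as follows, under the strong assumption that read errors are independent between bits, sectors and drives. Real drives tend to fail in clusters, so treat this as illustrative only. The 512-byte sector size and the function names are assumptions of this sketch; the 1E-15 per-bit rate, the 500GB drive size and the 3 other drives come from the thread.

```python
import math

def p_any(p, n):
    """Probability of at least one failure in n independent trials: 1-(1-p)^n.

    log1p/expm1 keep precision when p is far below double-precision epsilon.
    """
    return -math.expm1(n * math.log1p(-p))

P_BIT = 1e-15          # per-bit unrecoverable error rate (worst-case spec)
SECTOR_BYTES = 512     # ASSUMPTION: sector size, not stated in the thread
DRIVE_BYTES = 500e9    # 500GB drives
OTHER_DRIVES = 3       # drives that parity reconstruction must read

def hot_replace_failure():
    # A sector is unreadable if any of its bits is unreadable.
    q = p_any(P_BIT, SECTOR_BYTES * 8)
    # An LBA is lost only if the old drive's sector fails AND at least one
    # of the matching sectors on the other drives fails as well.
    per_lba = q * p_any(q, OTHER_DRIVES)
    n_sectors = DRIVE_BYTES / SECTOR_BYTES
    return p_any(per_lba, n_sectors)

def plain_rebuild_failure():
    # Ordinary rebuild: every bit on the other drives must read cleanly.
    return p_any(P_BIT, OTHER_DRIVES * DRIVE_BYTES * 8)

print(f"hot-replace:   {hot_replace_failure():.3g}")
print(f"plain rebuild: {plain_rebuild_failure():.3g}")
```

Under these assumptions the hot-replace figure comes out many orders of magnitude below the plain rebuild figure, which is the point of getting a second chance at each LBA.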

Cheers,

John.

[1] If the probability of an error while reading 1 bit is p, then the probability of at least one error while reading n bits is 1-(1-p)^n. In this case p=1E-15 and n=1.6E13 (2TB expressed in bits), and you need a scientific calculator to do the sum.
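
A minimal Python sketch of that sum, for anyone without a scientific calculator to hand. The function name is made up for illustration, and 1TB is taken as 1E12 bytes. It uses log1p and expm1 because 1-1E-15 is too close to 1 for a naive (1-p)**n to be trusted in double precision.

```python
import math

def p_any_error(p_bit, n_bits):
    """Probability of at least one error in n_bits reads: 1 - (1 - p)^n."""
    # (1 - p)^n = exp(n * log(1 - p)); log1p and expm1 avoid rounding loss.
    return -math.expm1(n_bits * math.log1p(-p_bit))

P_BIT = 1e-15  # worst-case spec quoted in the thread

# 2TB is the case in the footnote; 1.5TB is the three 500GB drives above.
for label, n_bytes in (("2TB", 2e12), ("1.5TB", 1.5e12)):
    prob = p_any_error(P_BIT, n_bytes * 8)
    print(f"{label}: {prob:.4f} (about 1 in {1 / prob:.0f})")
```

With the footnote's figures this comes out at roughly 1.6%, the same order as the 1 in 67 quoted above; the exact ratio depends on how the terabyte is counted.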