Re: proactive disk replacement

On 03/21/2017 11:41 AM, David Brown wrote:

> There /is/ a bit of correlation for early-fail drives coming from
> the same batch.  But there is little correlation for normal lifetime
> drives.
> 
> If you roll three dice and sum them, the expected sum will follow a
> nice Bell curve distribution.  If you pick another three dice and
> roll them, they will follow the same distribution for the expected
> sum.  But there is no correlation between the sums.
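[The quoted analogy is easy to verify numerically — a quick sketch of my own; the sample size and seed are arbitrary:]

```python
# Two independent sets of three dice follow the same distribution of
# sums, but the sums themselves are uncorrelated.
import random

random.seed(1)
N = 100_000

def roll3():
    """Sum of three fair six-sided dice."""
    return sum(random.randint(1, 6) for _ in range(3))

a = [roll3() for _ in range(N)]
b = [roll3() for _ in range(N)]

mean_a = sum(a) / N
mean_b = sum(b) / N

# Sample Pearson correlation; independence implies a value near zero.
cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / N
var_a = sum((x - mean_a) ** 2 for x in a) / N
var_b = sum((y - mean_b) ** 2 for y in b) / N
corr = cov / (var_a * var_b) ** 0.5

print(f"means: {mean_a:.2f} vs {mean_b:.2f}, correlation: {corr:.4f}")
```

Both means land near 10.5 (the expected sum of three dice) while the correlation stays near zero.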

Let me add to this:

The correlation is effectively immaterial in a non-degraded raid5 or a
singly-degraded raid6, because recovery will succeed as long as no two
errors fall in the same 4k block/sector location.  And in a non-degraded
raid6, all three UREs (unrecoverable read errors) must occur in the same
block/sector to lose data.  Some participants in this discussion need to
read the statistical description of this stuff here:

http://marc.info/?l=linux-raid&m=139050322510249&w=2
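[To put rough numbers on the same-location coincidence — a back-of-envelope sketch; the 4 TB drive size and the uniform-error-location assumption are mine, not from the thread:]

```python
# Probability that independent UREs on different drives land in the
# same 4k block offset, assuming error locations are uniform.
# Drive size below is a hypothetical 4 TB unit.

DRIVE_BYTES = 4 * 10**12
BLOCKS = DRIVE_BYTES // 4096   # ~1e9 4k blocks per drive

# Non-degraded raid5 / singly-degraded raid6: data is lost only if
# two UREs hit the same block offset on two different drives.
p_two_collide = 1 / BLOCKS

# Non-degraded raid6: all three UREs must hit the same offset.
p_three_collide = (1 / BLOCKS) ** 2

print(f"4k blocks per drive:       {BLOCKS:.3e}")
print(f"P(2 UREs, same block):     {p_two_collide:.3e}")
print(f"P(3 UREs, same block):     {p_three_collide:.3e}")
```

Even granting that UREs occur at all, the odds of them coinciding in the same block position are on the order of one in a billion (raid5) or one in 10^18 (raid6).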

As long as you are 'check' scrubbing every so often (I scrub weekly),
the odds of catastrophe on raid6 are the odds of something *else* taking
out the machine or controller, not the odds of simultaneous drive
failures.
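[For context on why regular scrubbing keeps those odds so small — a sketch assuming the commonly quoted spec-sheet URE rate of 1 per 1e14 bits read; that figure is my assumption, and real drives typically do much better:]

```python
# Expected UREs encountered in one full read (scrub) of a single
# drive, at an assumed spec-sheet rate of 1 URE per 1e14 bits read.

URE_PER_BIT = 1e-14          # assumed vendor-quoted URE rate
DRIVE_BYTES = 4 * 10**12     # hypothetical 4 TB drive

bits_read = DRIVE_BYTES * 8
expected_ures = bits_read * URE_PER_BIT

# A weekly 'check' scrub surfaces these errors and lets md rewrite
# the bad sectors from parity, so they never accumulate long enough
# to coincide across drives.
print(f"expected UREs per full scrub: {expected_ures:.2f}")
```

Roughly a third of a URE per full-drive pass, each of which the scrub repairs from redundancy before it can ever matter during a rebuild.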

Phil

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


