On 03/21/2017 11:41 AM, David Brown wrote:
> There /is/ a bit of correlation for early-fail drives coming from
> the same batch. But there is little correlation for normal lifetime
> drives.
>
> If you roll three dice and sum them, the expected sum will follow a
> nice Bell curve distribution. If you pick another three dice and
> roll them, they will follow the same distribution for the expected
> sum. But there is no correlation between the sums.

Let me add to this:

The correlation is effectively immaterial in a non-degraded raid5 and
singly-degraded raid6 because recovery will succeed as long as any two
errors are in different 4k block/sector locations. And for non-degraded
raid6, all three UREs must occur in the same block/sector to lose data.
Some participants in this discussion need to read the statistical
description of this stuff here:

http://marc.info/?l=linux-raid&m=139050322510249&w=2

As long as you are 'check' scrubbing every so often (I scrub weekly),
the odds of catastrophe on raid6 are the odds of something *else* taking
out the machine or controller, not the odds of simultaneous drive
failures.

Phil
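
For a rough sense of scale on the same-block argument above, here is a minimal sketch. It is not taken from the linked post. It assumes a 4 TB drive, 4 KiB blocks, exactly one URE per affected drive, and that each URE lands on a uniformly random, independent block. The function names are made up for illustration.

```python
# Illustrative sketch only. Assumptions (not from the original post):
# each URE lands on a uniformly random, independent 4 KiB block, and
# each affected drive has exactly one URE. Names are hypothetical.

def blocks_per_drive(capacity_bytes, block_bytes=4096):
    """Number of whole blocks on a drive of the given capacity."""
    return capacity_bytes // block_bytes


def p_same_block(n_blocks, errors_needed):
    """Chance that `errors_needed` UREs, one per drive, share a block offset.

    The first error may land anywhere; each further error must hit the
    same offset, which has probability 1/n_blocks under the assumptions.
    """
    return (1.0 / n_blocks) ** (errors_needed - 1)


n = blocks_per_drive(4 * 10**12)  # assumed 4 TB drive
p_two = p_same_block(n, 2)        # non-degraded raid5 / singly-degraded raid6
p_three = p_same_block(n, 3)      # non-degraded raid6
print(f"{n} blocks, p(2 aligned)={p_two:.3e}, p(3 aligned)={p_three:.3e}")
```

These figures are conditional on the errors having occurred at all, so they say nothing about how often UREs happen, only how unlikely it is that they line up.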