2017-03-21 14:02 GMT+01:00 David Brown <david.brown@xxxxxxxxxxxx>:
> Note that to cause failure in non-degraded RAID5 (or degraded RAID6),
> your two UREs need to be on the same stripe in order to cause data
> loss.  The chances of getting an URE somewhere on the disk are roughly
> proportional to the size of the disk - but the chance of getting an URE
> on the same stripe as another URE on another disk is basically
> independent of the disk size, and it is extraordinarily small.

A bit OT: is this also true for HW RAID controllers like LSI MegaRAID,
or do they tend to fail the rebuild on multiple UREs even when the UREs
are in different stripes?

> No, you cannot.  Your conclusion here is based on several totally
> incorrect assumptions:
>
> 1. You think that RAID5/RAID6 recovery is more stressful, because the
> parity is "all over the place".  This is wrong.
>
> 2. You think that random IO has higher chance of getting an URE than
> linear IO.  This is wrong.

Totally agree.

> 3. You think that getting an URE on one disk, then getting an URE on a
> second disk, counts as a double failure that will break a single-parity
> redundancy (RAID5, RAID1, RAID6 in degraded mode).  This is wrong - it
> is only a problem if the two UREs are in the same stripe, which is quite
> literally a one in a million chance.

I'm not sure about this. The posted paper is talking about "standard"
RAID built with HW RAID controllers, and I'm not sure whether they are
able to finish a rebuild after a double URE even when the UREs come from
different stripes. I think they fail the whole rebuild.
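
For what it's worth, the same-stripe argument quoted above can be put
into rough numbers. The sketch below is only a back-of-the-envelope
model, not a statement about how any particular controller behaves. It
assumes UREs are independent and Poisson-distributed, a URE rate of
1e-14 per bit read (a typical datasheet figure, assumed here), and a
512 KiB chunk per disk per stripe (also an assumption). The function
names are made up for illustration. It only covers the case where a
double URE costs you the affected stripe; it says nothing about
implementations that abort the whole rebuild on any second URE.

```python
import math

# Assumed values, not taken from the thread:
URE_RATE_PER_BIT = 1e-14      # typical datasheet figure for consumer disks
CHUNK_BYTES = 512 * 1024      # assumed chunk size per disk per stripe


def p_at_least_one_ure(bits_read, ure_rate_per_bit=URE_RATE_PER_BIT):
    """Probability of at least one URE while reading `bits_read` bits,
    assuming independent errors (Poisson model)."""
    # 1 - exp(-x), computed via expm1 to stay accurate for tiny x
    return -math.expm1(-bits_read * ure_rate_per_bit)


def p_same_stripe_ure(chunk_bytes=CHUNK_BYTES, other_disks=1,
                      ure_rate_per_bit=URE_RATE_PER_BIT):
    """Given an URE in one stripe on one disk, probability that at least
    one of the `other_disks` also has an URE in its chunk of that same
    stripe. Note that disk size is not an input at all."""
    bits_in_stripe_on_others = chunk_bytes * 8 * other_disks
    return p_at_least_one_ure(bits_in_stripe_on_others, ure_rate_per_bit)


# Compare the two probabilities for a few (decimal) disk sizes.
for disk_tb in (1, 4, 10):
    disk_bits = disk_tb * 1e12 * 8
    anywhere = p_at_least_one_ure(disk_bits)
    same_stripe = p_same_stripe_ure(other_disks=1)
    print(f"{disk_tb:>2} TB disk: P(URE anywhere on disk) = {anywhere:.3f}, "
          f"P(second URE in same stripe) = {same_stripe:.2e}")
```

Under these assumptions the chance of an URE somewhere on the disk grows
with disk size, while the chance of a second URE landing in the same
stripe depends only on the chunk size and the number of other disks.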