On 21-08-2017 10:37, Mikael Abrahamsson wrote:
> This doesn't solve the problem: it only detects writes made to the degraded array and syncs those, so it never checks whether the second mirror is out of sync with the first one. It doesn't fix the case of "fsck read the block and it was fine, but on the second drive it's not fine".
As stated elsewhere, you can re-attach a detached device with "--add-spare": this will copy *all* data from the other mirror leg. However, it is vastly better to simply issue a "repair" action. Either way, the basic problem remains: with larger drives, this will take many hours or even days.
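To make the "repair" action and the time cost concrete, here is a minimal sketch. It assumes the md sysfs files `sync_action` and `mismatch_cnt` under a directory such as `/sys/block/md0/md` (the array name `md0` is hypothetical); the helper names and the size/throughput figures in the note below are mine, purely for illustration.

```python
from pathlib import Path


def start_sync_action(md_dir, action="repair"):
    """Request a scrub by writing the action name to the array's sync_action file.

    md_dir is the array's md sysfs directory, e.g. /sys/block/md0/md
    (hypothetical array name). Writing there needs root on a real system.
    """
    if action not in ("check", "repair", "idle"):
        raise ValueError("unsupported action: %r" % (action,))
    (Path(md_dir) / "sync_action").write_text(action + "\n")


def read_mismatch_cnt(md_dir):
    """Return the mismatch counter that md reports after a check or repair."""
    return int((Path(md_dir) / "mismatch_cnt").read_text().strip())


def estimate_hours(size_bytes, bytes_per_sec):
    """Rough lower bound for a full pass over one mirror leg, in hours."""
    return size_bytes / bytes_per_sec / 3600
```

For example, `estimate_hours(8e12, 100e6)` gives roughly 22 hours for an assumed 8 TB leg at an assumed sustained 100 MB/s, and that is with no other load on the array.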
> However, this again causes the problem that if there is a URE on the degraded array's remaining drive, things will fail.
On relatively recent MDRAID code (kernel > 3.5.x), a degraded array with a URE on another disk will *not* fail entirely. Rather, a bad block is logged in the MDRAID superblock and a read error is returned to the upper layers.
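If you want to see what has been logged, the per-device bad block list is exposed in sysfs. The sketch below assumes the `bad_blocks` file under the array's `dev-<name>` directory, with one "start-sector length" pair per line; the function names and the example device name are hypothetical.

```python
from pathlib import Path


def parse_bad_blocks(text):
    """Parse bad_blocks content into a list of (start_sector, length) tuples."""
    ranges = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        ranges.append((int(parts[0]), int(parts[1])))
    return ranges


def read_bad_blocks(md_dir, dev_name):
    """Read the bad block list of one member device.

    md_dir is e.g. /sys/block/md0/md and dev_name e.g. sda1 (both hypothetical).
    """
    path = Path(md_dir) / ("dev-" + dev_name) / "bad_blocks"
    return parse_bad_blocks(path.read_text())
```

An empty list means no bad blocks are currently recorded for that member.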
Anyway, this has little to do with the main problem: micro power losses can cause undetected, silent data corruption, even with synced writes.
> The only way to solve this is to add more code to implement a new mode, which would be "repair-on-read". I understand that we can't necessarily detect which drive has the right or wrong information, but this way we can at least make sure that when fsck is done, all the inodes and other metadata are consistent. Everything that fsck touched during its run will be consistent across all drives, with correct parity. It might not contain the "best" information that could have been presented by a more intelligent algorithm/metadata, but at least it's better than today, when after a fsck run you don't know whether parity is correct or not.
>
> It would also be a good diagnostic tool for admins. If you suspect that you're getting inconsistencies but you're fine with the performance degradation, then md could log inconsistencies somewhere so you know about them.
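To illustrate the behaviour being proposed, here is a toy model of "repair-on-read" for a two-way mirror. It is not md code and no such mode exists as far as this thread goes; the `ToyMirror` class is hypothetical, and the choice to trust the first leg is arbitrary, in line with the admission above that we can't necessarily tell which drive is right.

```python
class ToyMirror:
    """In-memory stand-in for a two-leg mirror; each leg is a list of blocks."""

    def __init__(self, leg_a, leg_b):
        self.legs = [list(leg_a), list(leg_b)]
        self.mismatch_log = []  # block numbers where the legs disagreed

    def read(self, block):
        """Read one block from both legs; on mismatch, log it and resync.

        Assumption: leg 0 is treated as authoritative. The result is
        consistent across legs, not necessarily the "best" data.
        """
        a = self.legs[0][block]
        b = self.legs[1][block]
        if a != b:
            self.mismatch_log.append(block)
            self.legs[1][block] = a
        return a
```

Every block that gets read ends up identical on both legs, and `mismatch_log` gives the admin the list of inconsistencies that were found along the way. The cost is reading from every leg on every read, which is the performance degradation mentioned above.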
I second that. Thanks.

-- 
Danti Gionatan
Technical Support
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8