On 20/08/17 16:48, Mikael Abrahamsson wrote:
> On Mon, 21 Aug 2017, Adam Goryachev wrote:
>
>> data (even where it is wrong). So just do a check/repair which will
>> ensure both drives are consistent, then you can safely do the fsck.
>> (Assuming you fixed the problem causing random write errors first).
>
> This involves manual intervention.
>
> While I don't know how to implement this, let's at least see if we can
> architect something for throwing ideas around.
>
> What about having an option for any raid level that would do "repair on
> read". So you can do "0" or "1" on this. RAID1 would mean it reads all
> stripes and if there is inconsistency, pick one and write it to all of
> them. It could also be some kind of IOCTL option I guess. For RAID5/6,
> read all data drives, and check parity. If parity is wrong, write parity.
>
> This could mean that if filesystem developers wanted to do repair (and
> this could be a userspace option or mount option), it would use the
> beforementioned option for all fsck-like operation to make sure that
> metadata was consistent while doing fsck (this would be different for
> different tools, if it's an "fs needs to be mounted"-type of fs, or if
> it's an "offline fsck" type filesystem. Then it could go back to normal
> operation for everything else that would hopefully not cause
> catastrophical failures to the filesystem, but instead just individual
> file corruption in case of mismatches.
>

Look for the thread "RFC Raid error detection and auto-recovery", 10th May.

Basically, that proposed a three-way flag - "default" is the current
"read the data section", "check" would read the entire stripe and
compare the mirrors or calculate parity on a parity raid, and return a
read error if it couldn't work out the correct data, and "fix" would
write the correct data back if it could work it out.

So basically, on a two-disk raid-1, or raid 4 or 5, both "check" and
"fix" would return read errors if there's a problem and you're SOL
without a backup.
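
For what it's worth, here's a rough sketch of how that three-way flag
might behave for the mirror case. It's plain Python modelling a raid-1 as
a list of byte strings, not md code. ReadMode, ReadError and
read_mirrored_block are names made up for illustration, and the
strict-majority vote is my assumption about how "work out the correct
data" would be decided across mirrors.

```python
from collections import Counter
from enum import Enum


class ReadMode(Enum):
    # Hypothetical names for the three-way flag described in the thread.
    DEFAULT = "default"  # read one copy, no cross-checking
    CHECK = "check"      # compare all copies, error if no winner
    FIX = "fix"          # as CHECK, and rewrite the copies that lost


class ReadError(Exception):
    """Raised when the copies disagree and no correct value can be chosen."""


def read_mirrored_block(copies, mode=ReadMode.DEFAULT):
    """Read one block from a mirror set modelled as a list of bytes objects.

    `copies` is modified in place only when mode is FIX.
    """
    if mode is ReadMode.DEFAULT:
        # Current behaviour: trust whichever copy we happen to read.
        return copies[0]

    counts = Counter(copies)
    if len(counts) == 1:
        return copies[0]  # all mirrors agree

    winner, votes = counts.most_common(1)[0]
    # Assumption: require a strict majority of all copies. A two-disk
    # mirror that disagrees can never satisfy this, so it always errors.
    if votes * 2 <= len(copies):
        raise ReadError("mirrors disagree and no majority exists")

    if mode is ReadMode.FIX:
        # Write the majority value back over every copy.
        for i in range(len(copies)):
            copies[i] = winner
    return winner
```

With two copies that disagree, both CHECK and FIX raise ReadError. With
three copies where one differs, both return the majority value, and FIX
also overwrites the odd one out.
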
With a three-disk or more raid-1, or raid-6, it would return the correct
data (and fix the stripe) if it could, otherwise again you're SOL.

Cheers,
Wol