On 9 May 2017, nix@xxxxxxxxxxxxx outgrape:

> On 9 May 2017, Chris Murphy verbalised:
>
>> 1. md reports all data drives and the LBAs for the affected stripe
>
> Enough rambling from me. Here's a hilariously untested patch against
> 4.11 (as in I haven't even booted with it: my systems are kind of in
> flux right now as I migrate to the md-based server that got me all
> concerned about this). It compiles! And it's definitely safer than
> trying a repair, and makes it possible to recover from a real mismatch
> without losing all your hair in the process, or determine that a
> mismatch is spurious or irrelevant. And that's enough for me, frankly.
> This is a very rare problem, one hopes.
>
> (It's probably not ideal, because the error is just known to be
> somewhere in that stripe, not on that sector, which makes determining
> the affected data somewhat harder. But at least you can figure out what
> filesystem it's on. :) )

Aside: this foolish optimist hopes that it might be fairly easy to tie
the new GETFSMAP ioctl() into mismatch reports if the filesystem(s)
overlying a mismatched stripe support it: it looks like we could get the
necessary info for a whole stripe in a single call. Being automatically
told "these files may be corrupted, restore them" or "oops you lost some
metadata on fses A and B, run fsck" would be wonderful. (Though the
actual corruption would be less wonderful.)

This feels like something mdadm's monitor mode should be able to do, to
me. I'll have a look in a bit, but I know nothing about the
implementation of monitor mode at all so I have some learning to do
first...

-- 
NULL && (void)
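
To make the "whole stripe in a single call" idea a bit more concrete, here is a rough sketch of the arithmetic a monitor might do before asking the filesystem anything: turn a mismatched stripe into one byte range in the array's logical address space, which could then serve as the low and high keys of a single GETFSMAP query. Everything here is an assumption for illustration, not what the patch or mdadm does: the function names are made up, the sector is taken to be a 512-byte offset into each component's data area, the array is assumed to be a parity RAID (RAID5/6 style) where every stripe holds `data_disks` chunks of data, and the filesystem is assumed to sit directly on the array with no partition or LVM offset in between.

```python
SECTOR_BYTES = 512  # md reports positions in 512-byte sectors


def stripe_index_for_sector(component_sector, chunk_bytes):
    """Hypothetical helper: which stripe a per-component data sector falls in.

    Assumes component_sector is relative to the start of the data area,
    so data offsets and superblocks are already accounted for.
    """
    if component_sector < 0 or chunk_bytes <= 0:
        raise ValueError("sector must be >= 0 and chunk size must be > 0")
    return (component_sector * SECTOR_BYTES) // chunk_bytes


def stripe_byte_range(stripe_index, chunk_bytes, data_disks):
    """Hypothetical helper: (start, end) bytes of one stripe, end exclusive.

    The range is in the array's logical address space. Parity rotation
    does not matter here, because each stripe contributes exactly
    data_disks chunks of logical data regardless of which disk holds parity.
    """
    if stripe_index < 0 or chunk_bytes <= 0 or data_disks <= 0:
        raise ValueError("arguments must be non-negative / positive")
    stripe_bytes = chunk_bytes * data_disks
    start = stripe_index * stripe_bytes
    return start, start + stripe_bytes


if __name__ == "__main__":
    # Example: 512 KiB chunks, 4 data disks, mismatch at component sector 2048.
    idx = stripe_index_for_sector(2048, 512 * 1024)
    low, high = stripe_byte_range(idx, 512 * 1024, 4)
    print(f"stripe {idx}: array bytes [{low}, {high})")
```

The resulting range is what would be handed to the filesystem as the span of interest; whether GETFSMAP can actually answer for it depends on the filesystem supporting the ioctl at all, as noted above.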