On 9 May 2017, Chris Murphy verbalised: > 1. md reports all data drives and the LBAs for the affected stripe Enough rambling from me. Here's a hilariously untested patch against 4.11 (as in I haven't even booted with it: my systems are kind of in flux right now as I migrate to the md-based server that got me all concerned about this). It compiles! And it's definitely safer than trying a repair, and makes it possible to recover from a real mismatch without losing all your hair in the process, or determine that a mismatch is spurious or irrelevant. And that's enough for me, frankly. This is a very rare problem, one hopes. (It's probably not ideal, because the error is just known to be somewhere in that stripe, not on that sector, which makes determining the affected data somewhat harder. But at least you can figure out what filesystem it's on. :) ) 8<------------------------------------------------------------->8 From: Nick Alcock <nick.alcock@xxxxxxxxxx> Subject: [PATCH] md: report sector of stripes with check mismatches This makes it possible, with appropriate filesystem support, for a sysadmin to tell what is affected by the mismatch, and whether it should be ignored (if it's inside a swap partition, for instance). We ratelimit to prevent log flooding: if there are so many mismatches that ratelimiting is necessary, the individual messages are relatively unlikely to be important (either the machine is swapping like crazy or something is very wrong with the disk). Signed-off-by: Nick Alcock <nick.alcock@xxxxxxxxxx> --- drivers/md/raid5.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index ed5cd705b985..bcd2e5150e29 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -3959,10 +3959,14 @@ static void handle_parity_checks5(struct r5conf *conf, struct stripe_head *sh, set_bit(STRIPE_INSYNC, &sh->state); else { atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); - if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) + if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) { /* don't try to repair!! */ set_bit(STRIPE_INSYNC, &sh->state); - else { + pr_warn_ratelimited("%s: mismatch around sector " + "%llu\n", __func__, + (unsigned long long) + sh->sector); + } else { sh->check_state = check_state_compute_run; set_bit(STRIPE_COMPUTE_RUN, &sh->state); set_bit(STRIPE_OP_COMPUTE_BLK, &s->ops_request); @@ -4111,10 +4115,14 @@ static void handle_parity_checks6(struct r5conf *conf, struct stripe_head *sh, } } else { atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); - if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) + if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) { /* don't try to repair!! */ set_bit(STRIPE_INSYNC, &sh->state); - else { + pr_warn_ratelimited("%s: mismatch around sector " + "%llu\n", __func__, + (unsigned long long) + sh->sector); + } else { int *target = &sh->ops.target; sh->ops.target = -1; -- 2.12.2.212.gea238cf35.dirty -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html