On 16 May 2017, NeilBrown said: >> - else { >> + pr_warn_ratelimited("%s: mismatch around sector " >> + "%llu\n", __func__, >> + (unsigned long long) >> + sh->sector); >> + } else { > > I think there is no point giving the function name, > but that you should give the name of the array. *ouch* I can't believe I forgot that. I have more than one array myself... "we have a fault but we don't know what array it's on" is not much of an improvement over the status quo, really! (though you could make a good guess by looking for preceding sync-start messages, you can of course sync two arrays at the same time...) > Also "around" is a little vague. Intentionally: I couldn't think of the right terminology. Yours is better. > Maybe something like: > >> + pr_warn_ratelimited("%s: mismatch sector in range " >> + "%llu-%llu\n", mdname(conf->mddev), >> + (unsigned long long) sh->sector, >> + (unsigned long long) sh->sector + STRIPE_SECTORS); Nice! Here's a rerolled patch. (We exceed the 80-char limit but that's pr_warn_ratelimited()'s fault for having such a long name!) Tested by making a raid array on a bunch of sparse files then dding a byte of garbage into one of them and checking it. I got a nice error message, name and all, and the sector count looked good. >From f05a451d46900849c7965a0e7dde085f1fb50dfc Mon Sep 17 00:00:00 2001 From: Nick Alcock <nick.alcock@xxxxxxxxxx> Date: Tue, 9 May 2017 21:55:17 +0100 Subject: [PATCH] md: report sector of stripes with check mismatches This makes it possible, with appropriate filesystem support, for a sysadmin to tell what is affected by the mismatch, and whether it should be ignored (if it's inside a swap partition, for instance). We ratelimit to prevent log flooding: if there are so many mismatches that ratelimiting is necessary, the individual messages are relatively unlikely to be important (either the machine is swapping like crazy or something is very wrong with the disk). Signed-off-by: Nick Alcock <nick.alcock@xxxxxxxxxx> --- drivers/md/raid5.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index ed5cd705b985..937314051be5 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -3959,10 +3959,15 @@ static void handle_parity_checks5(struct r5conf *conf, struct stripe_head *sh, set_bit(STRIPE_INSYNC, &sh->state); else { atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); - if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) + if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) { /* don't try to repair!! */ set_bit(STRIPE_INSYNC, &sh->state); - else { + pr_warn_ratelimited("%s: mismatch sector in range " + "%llu-%llu\n", mdname(conf->mddev), + (unsigned long long) sh->sector, + (unsigned long long) sh->sector + + STRIPE_SECTORS); + } else { sh->check_state = check_state_compute_run; set_bit(STRIPE_COMPUTE_RUN, &sh->state); set_bit(STRIPE_OP_COMPUTE_BLK, &s->ops_request); @@ -4111,10 +4116,15 @@ static void handle_parity_checks6(struct r5conf *conf, struct stripe_head *sh, } } else { atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); - if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) + if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) { /* don't try to repair!! */ set_bit(STRIPE_INSYNC, &sh->state); - else { + pr_warn_ratelimited("%s: mismatch sector in range " + "%llu-%llu\n", mdname(conf->mddev), + (unsigned long long) sh->sector, + (unsigned long long) sh->sector + + STRIPE_SECTORS); + } else { int *target = &sh->ops.target; sh->ops.target = -1; -- 2.12.2.212.gea238cf35.dirty -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html