On Fri, May 19 2017, Shaohua Li wrote: > On Fri, May 19, 2017 at 11:32:43AM +0100, Nix wrote: >> On 19 May 2017, NeilBrown said: >> >> > On Tue, May 16 2017, Nix wrote: >> > >> >> On 16 May 2017, NeilBrown spake thusly: >> >> >> >>> Actually, I have another caveat. I don't think we want these messages >> >>> during initial resync, or any resync. Only during a 'check' or >> >>> 'repair'. >> >>> So add a check for MD_RECOVERY_REQUESTED or maybe for >> >>> sh->sectors >= conf->mddev->recovery_cp >> >> >> >> I completely agree, but it's already inside MD_RECOVERY_CHECK: >> >> >> >> if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) { >> >> /* don't try to repair!! */ >> >> set_bit(STRIPE_INSYNC, &sh->state); >> >> pr_warn_ratelimited("%s: mismatch sector in range " >> >> "%llu-%llu\n", mdname(conf->mddev), >> >> (unsigned long long) sh->sector, >> >> (unsigned long long) sh->sector + >> >> STRIPE_SECTORS); >> >> } else { >> >> >> >> Doesn't that already mean that someone has explicitly triggered a check >> >> action? >> > >> > Uhmm... yeah. I lose track of which flags me what exactly. >> > You log messages aren't generated when 'repair' is used, only when >> > 'check' is. >> > I can see why you might have chosen that, but I wonder if it is best. >> >> I'm not sure what the point is of being told when repair is used: hey, >> there was an inconsistency here but there isn't any more! I suppose you >> could still use it to see if the repair did the right thing. My problem >> on that front was that I'm not sure what flag should be used to catch >> repair but not resync etc: everywhere else in the code, repair is in an >> unadorned else branch... is it the *lack* of MD_RECOVERY_CHECK and the >> presence of, uh, something else? > MD_RECOVERY_SYNC && MD_RECOVERY_REQUESTED && MD_RECOVERY_CHECK == check > MD_RECOVERY_SYNC && MD_RECOVERY_REQUESTED == repair > MD_RECOVERY_SYNC && !MD_RECOVERY_REQUESTED == resync > > Don't see the poin to print the info for 'repair'. 'repair' already changes the > data, how could we use the info? Surprising data is can be potentially valuable. I don't think you should *ever* get an inconsistency in a RAID6 unless you have faulty hardware. If you do, then any information about the nature of the inconsistency might be valuable in understanding the hardware fault. I don't know in advance how I would interpret the data, but I do know that if I didn't have the data, then I wouldn't be able to interpret it. However .... running "repair" when you don't know exactly what has happened and why, is probably a bad idea. So logging probably won't provide value. I wouldn't go out of my way to add extra logging for the 'repair' case, but I certainly wouldn't go out of my way to avoid logging in that case. It seems inconsistent to log for 'check' but not 'repair', but it isn't a big deal for me. NeilBrown
Attachment:
signature.asc
Description: PGP signature