On Tue, Mar 26, 2019 at 9:08 AM Nigel Croxon <ncroxon@xxxxxxxxxx> wrote: > > Changing state from check_state_check_result to > check_state_compute_result not only is unsafe but also doesn't > appear to serve a valid purpose. A raid6 check should only be > pushing out extra writes if doing repair and a mis-match occurs. > The stripe dev management will already try and do repair writes > for failing sectors. Could you please explain more details about the failed case before this patch? Thanks, Song > > This patch makes the raid6 check_state_check_result handling > work more like raid5's. If somehow too many failures for a > check, just quit the check operation for the stripe. When any > checks pass, don't try and use check_state_compute_result for > a purpose it isn't needed for and is unsafe for. Just mark the > stripe as in sync for passing its parity checks and let the > stripe dev read/write code and the bad blocks list do their > job handling I/O errors. > > OriginalAuthor: David Jeffery <djeffery@xxxxxxxxxx> > Tested-by: David Jeffery <djeffery@xxxxxxxxxx> > Signed-off-by: David Jeffy <djeffery@xxxxxxxxxx> > Signed-off-by: Nigel Croxon <ncroxon@xxxxxxxxxx> > --- > drivers/md/raid5.c | 19 ++++--------------- > 1 file changed, 4 insertions(+), 15 deletions(-) > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index c033bfcb209e..4204a972ecf2 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -4223,26 +4223,15 @@ static void handle_parity_checks6(struct r5conf *conf, struct stripe_head *sh, > case check_state_check_result: > sh->check_state = check_state_idle; > > + if (s->failed > 1) > + break; > /* handle a successful check operation, if parity is correct > * we are done. Otherwise update the mismatch count and repair > * parity if !MD_RECOVERY_CHECK > */ > if (sh->ops.zero_sum_result == 0) { > - /* both parities are correct */ > - if (!s->failed) > - set_bit(STRIPE_INSYNC, &sh->state); > - else { > - /* in contrast to the raid5 case we can validate > - * parity, but still have a failure to write > - * back > - */ > - sh->check_state = check_state_compute_result; > - /* Returning at this point means that we may go > - * off and bring p and/or q uptodate again so > - * we make sure to check zero_sum_result again > - * to verify if p or q need writeback > - */ > - } > + /* Any parity checked was correct */ > + set_bit(STRIPE_INSYNC, &sh->state); > } else { > atomic64_add(STRIPE_SECTORS, &conf->mddev->resync_mismatches); > if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery)) { > -- > 2.20.1 >