Re: raid6 with dm-integrity should not cause device to fail

Chris Murphy <lists@xxxxxxxxxxxxxxxxx> · Thu, 5 Sep 2019 14:15:17 -0600

On Thu, Sep 5, 2019 at 1:06 PM Robin Hill <robin@xxxxxxxxxxxxxxx> wrote:
>
> It's definitely a policy question, yes, and more flexibility in how
> these errors are handled would indeed be good. The specific cases here
> are thousands of integrity mismatches artificially introduced into
> sequential blocks covering half the device though. I don't see any
> reasonable error-handling method doing anything other than kicking the
> drive in that case. Ignoring them on the basis that they're dm-integrity
> mismatches rather than read errors reported from the drive does not
> sound like the right fix (unless we're expecting dm-integrity, or the
> block-layer generally, to have built-in error counting and
> device-failing?).

I think that's reasonable, and it also amounts to a lot of spam going
into the journal. I would still say kernel code should be able to
handle it. If there are many in-sequence errors, I think it's OK to
drop a lot of those messages, e.g. as part of the printk.devkmsg
option. For sure, by default no one wants so many effectively redundant
errors reported that journald/syslog is blocked and other errors can't
be logged quickly and reliably.
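
To illustrate what dropping redundant in-sequence messages could look
like, here is a minimal userspace sketch that collapses consecutive bad
sector numbers into runs, so one line can be logged per run instead of
one per sector. The function name and the (start, count) format are
hypothetical; this is not how printk or dm-integrity rate limiting is
actually implemented.

```python
def summarize_errors(sectors):
    """Collapse bad sector numbers into (start, count) runs.

    Hypothetical helper for illustration only; not a kernel interface.
    """
    runs = []
    for sector in sorted(set(sectors)):
        # Extend the current run if this sector directly follows it.
        if runs and sector == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1
        else:
            runs.append([sector, 1])
    return [(start, count) for start, count in runs]


if __name__ == "__main__":
    for start, count in summarize_errors([100, 101, 102, 200, 103]):
        print(f"integrity mismatch: {count} sector(s) starting at {start}")
```

With thousands of sequential mismatches this would emit a single summary
line rather than thousands of near-identical ones.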

And I defer to the md/mdadm/lvm devs to pick a sane, generic,
one-size-fits-all policy to eject a drive in this case. For sure, if
every read fails for even 10-20 seconds, that's GiBs of bad data with
spinning rust, and possibly hundreds of GiBs if it's flash, and I
absolutely agree: eject it. Or more correctly, demote it. Do not read
from it *unless* you arrive at a stripe read failure and all you need
is one more successful strip read to get a successful reconstruction.
In that case, why not try reading from the demoted drive? If it's
totally ejected, you have no chance.
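
The demote-instead-of-eject idea can be sketched roughly as follows.
This is only an illustration under stated assumptions: read_stripe, the
read_strip callback (returning data, or None on a failed or
integrity-rejected read), and the demoted set are all hypothetical
names, not md interfaces, and real reconstruction of the missing strips
is left out.

```python
def read_stripe(devices, demoted, read_strip, n_parity=2):
    """Gather enough good strips to reconstruct one stripe.

    Demoted members are consulted only as a last resort. Returns a dict
    of device -> data, or None if too few strips could be read.
    All names here are hypothetical, for illustration only.
    """
    needed = len(devices) - n_parity  # raid6: any n-2 strips suffice
    good = {}

    # First pass: healthy members only.
    for dev in devices:
        if dev in demoted:
            continue
        data = read_strip(dev)
        if data is not None:
            good[dev] = data

    # Second pass: touch demoted members only if we are still short.
    for dev in devices:
        if len(good) >= needed:
            break
        if dev in demoted:
            data = read_strip(dev)
            if data is not None:
                good[dev] = data

    return good if len(good) >= needed else None
```

With six members and one of them demoted, two failed reads on healthy
members leave three good strips; the demoted member is then tried, and a
strip that passes the integrity check makes the fourth. Had the drive
been ejected outright, that stripe would be unrecoverable.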

> > > Admittedly, with dm-integrity we can probably trust that anything read
> > > from the disk which makes it past the integrity check is valid, so there
> > > may be cases where the data on there is needed to complete a stripe.
> > > That seems a rather theoretical and contrived circumstance though - in
> > > most cases you're better just kicking the drive from the array so the
> > > admin knows that it needs replacing.
> >
> > I don't agree that a heavy hammer is needed in order to send a notification.
> >
> You think that most people using this will be monitoring for
> dm-integrity reported errors? If all the errors are just rewritten
> silently then it's likely the only sign of an issue will be a
> performance impact, with no obvious sign as to where it's coming from.

I very well might want a policy that says: send a notification if more
than 10 errors of any nature are encountered within 1 minute. Maybe
that drive gets scheduled for swap out sooner rather than later, but
not urgently. But ejecting the drive upon many errors, to act as the
notification of a problem? I don't like that design. Those are actually
two different problems, and I'm not being informed of the initial
cause, only of the far more urgent "drive ejected" case.
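
A notification policy of that kind could be as simple as a
sliding-window counter. The sketch below assumes a hypothetical
ErrorWindow class fed with error timestamps in seconds; the threshold
of 10 and the 60 second window come from the example above, everything
else is made up for illustration and is not an mdadm or dm-integrity
feature.

```python
from collections import deque


class ErrorWindow:
    """Hypothetical sliding-window error counter (illustration only)."""

    def __init__(self, threshold=10, window_seconds=60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times = deque()
        self._notified = False

    def record(self, timestamp):
        """Record one error; return True when a notification is due."""
        self._times.append(timestamp)
        # Forget errors that have aged out of the window.
        while timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        if len(self._times) > self.threshold:
            # Notify once per burst, not once per error.
            if not self._notified:
                self._notified = True
                return True
        else:
            self._notified = False
        return False
```

The point of the design is that the notification fires on the error
rate itself, and is independent of whether the drive is later demoted
or ejected.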

-- 
Chris Murphy