Re: raid6 with dm-integrity should not cause device to fail

On Thu, Sep 5, 2019 at 10:15 AM Robin Hill <robin@xxxxxxxxxxxxxxx> wrote:
>
>
> I'm not clear what you (or the author of the article) are expecting
> here. You've got a disk (or disks) with thousands of read errors -
> whether these are dm-integrity mismatches or disk-level read errors
> doesn't matter - the disk is toast and needs replacing ASAP (or it's in
> an environment where it - and you - probably shouldn't be).

That sounds to me like a policy question. The kernel code should be
able to handle the errors, including rate limiting if the errors are
massive. Whether the limit is X errors per unit time, or a Y:Z ratio
of bad to good sectors read, is a policy question, and it's reasonable
for md developers to pick a sane default for that policy. But to say
that thousands of corruptions are inherently a device failure, when
easily 1 million more reads in the same time frame are good? You'd be
giving up a better chance of recovery during rebuilds and device
replacements by flat out ejecting such a device. Also, the device
could be a network device. The problem could be transient. Or it could
be discovered and fixed well before the device would have to be
ejected, then manually re-added and rebuilt.
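
To make the policy idea concrete, here is a rough sketch in Python. It
is not md code and does not reflect how md decides anything today; the
class name, the thresholds, and the "keep"/"notify"/"eject" outcomes
are all made up for illustration. It only shows that an error rate and
a bad:good ratio can be evaluated separately, with ejection reserved
for the case where both limits are exceeded.

```python
from collections import deque


class ReadErrorPolicy:
    """Hypothetical per-device policy: judge a member device by its
    error rate and bad:good read ratio, not by a raw error count."""

    def __init__(self, max_errors_per_window=1000, window_seconds=60.0,
                 max_bad_ratio=0.01):
        # All three thresholds are illustrative assumptions.
        self.max_errors_per_window = max_errors_per_window
        self.window_seconds = window_seconds
        self.max_bad_ratio = max_bad_ratio
        self.error_times = deque()  # timestamps of recent failed reads
        self.good_reads = 0
        self.bad_reads = 0

    def record(self, ok, now):
        """Record one read result observed at time `now` (seconds)."""
        if ok:
            self.good_reads += 1
        else:
            self.bad_reads += 1
            self.error_times.append(now)
        # Forget errors that have fallen out of the time window.
        while (self.error_times
               and now - self.error_times[0] > self.window_seconds):
            self.error_times.popleft()

    def decide(self):
        """Return 'keep', 'notify', or 'eject'."""
        total = self.good_reads + self.bad_reads
        ratio = self.bad_reads / total if total else 0.0
        rate_exceeded = len(self.error_times) > self.max_errors_per_window
        ratio_exceeded = ratio > self.max_bad_ratio
        if rate_exceeded and ratio_exceeded:
            return "eject"   # mostly bad, and failing fast
        if rate_exceeded or ratio_exceeded:
            return "notify"  # tell the admin, keep the device in the array
        return "keep"
```

With these made-up defaults, a device that returns 2000 bad reads and
1 million good ones within a minute exceeds the rate limit but not the
ratio limit, so the sketch notifies the admin and keeps the device
available for reconstruction.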


> Admittedly, with dm-integrity we can probably trust that anything read
> from the disk which makes it past the integrity check is valid, so there
> may be cases where the data on there is needed to complete a stripe.
> That seems a rather theoretical and contrived circumstance though - in
> most cases you're better just kicking the drive from the array so the
> admin knows that it needs replacing.

I don't agree that a heavy hammer is needed in order to send a notification.


-- 
Chris Murphy


