Re: Fault tolerance with badblocks

On 09/05/17 13:27, Nix wrote:
> On 9 May 2017, David Brown uttered the following:

> (I'm not suggesting repairing RAID-5 mismatches. That's clearly
> impossible. You can't even tell what disk is affected. But in the RAID-6
> case none of this is impossible, or so it seems to me. You have at least
> three and probably four or more drives with consistent syndromes, and
> one that is out of whack. You know which one must be wrong -- the
> "minority vote" -- and you know what has to be done to make it
> consistent with the others again. Why not do it? It's no more risky than
> that aspect of a RAID rebuild from a failed disk would be.)
> 
>> RAID will /not/ let you reliably detect or correct other sorts of
>> errors.
> 
> ... only it clearly can. What stops it from handling the RAID-6-and-
> one-disk-is-wrong case where it cannot handle the RAID-6-and-one-disk-
> has-failed case, given that you can unambiguously determine which disk
> is wrong using the data on the surviving drives, with an undetected-
> failure probability of something way below 2^-128? (I could work out the
> actual value but I haven't had any coffee yet and it seems pointless
> when it's that low.)
> 
>> What does /not/ work, however, is trying to squeeze magic capabilities
> out of existing layers in the system, or expecting more out of them than
> they can give.
> 
> I don't see that these capabilities are any more magic than what RAID-6
> does already. It can recover from two failed drives: why can't it
> recover from one wrong one? (Or, rather, from one drive with very
> occasionally wrong sectors on it. Obviously if it was always getting
> things wrong its presence is not a benefit and you have essentially
> fallen back to nothing better than RAID-5, only with worse performance.
> But that's what error thresholds are for, which md already employs in
> similar situations.)
> 

I thought you said that you had read Neil's article.  Please go back and
read it again.  If you don't agree with what is written there, then
there is little more I can say to convince you.

One thing I can try is to note that you are /not/ the first person to
think "Surely with RAID-6 we can correct mismatches - it should be
easy?".  You are /not/ the first person to think "Correcting RAID-6
mismatches would be a marvellous feature that would make it /far/
better".  Linux md raid does not correct RAID-6 mismatches found on a
scrub.  To my (admittedly limited) knowledge, hardware RAID-6 systems do
not correct mismatches found on a scrub.  If correcting RAID-6
mismatches were as simple, reliable, and useful as you seem to believe,
then I think Linux md raid would already do it - either as part of the
scrub, or as an extra utility to run on mismatched stripes.
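
For anyone following the thread who wants to see the arithmetic behind the
quoted "minority vote" argument, here is a minimal sketch.  It assumes the
usual RAID-6 construction (P is the XOR of the data bytes, Q is the sum of
g^i * D_i over GF(2^8) with polynomial 0x11d and g = 2) and works on one
byte per disk.  The function names are made up for illustration; this is
not md code, and it shows only the algebra, not whether acting on the
result is a good idea.

```python
# Sketch only: byte-wise RAID-6 P/Q syndromes over GF(2^8).
# Assumptions: polynomial 0x11d, generator 2; names are hypothetical.

def build_tables():
    """Build exponent and logarithm tables for GF(2^8)."""
    exp = [0] * 510
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= 0x11d
    # Duplicate the table so gf_mul needs no modulo on the index.
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log

EXP, LOG = build_tables()

def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def syndromes(data):
    """Return (P, Q) for one byte from each data disk."""
    p = 0
    q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(EXP[i], d)
    return p, q

def locate(data, p_stored, q_stored):
    """Guess which single block disagrees, assuming at most one is wrong.

    Returns "ok", "P", "Q", a data disk index, or "multiple".
    """
    p, q = syndromes(data)
    dp = p ^ p_stored
    dq = q ^ q_stored
    if dp == 0 and dq == 0:
        return "ok"
    if dq == 0:
        return "P"
    if dp == 0:
        return "Q"
    # For a single bad data disk z: dp = E and dq = g^z * E.
    z = (LOG[dq] - LOG[dp]) % 255
    if z < len(data):
        return z
    return "multiple"
```

Under the single-error assumption, XORing dp into the byte on disk z would
restore it.  The assumption carries all the weight, though: if two blocks
are wrong at once, the computed index can land on a third, innocent disk,
and a real stripe has many bytes that need not all agree on the same index.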
