Re: RFC - Raid error detection and auto-recovery (was Fault tolerance with badblocks)

On 05/16/2017 06:33 AM, Wols Lists wrote:
> On 15/05/17 23:31, Phil Turmel wrote:

>> If and only if it is known that all but the supposedly corrupt block
>> were written together (complete stripe) and no possibility of
>> perturbation occurred between the original calculation of P,Q in the CPU
>> and original transmission of all of these blocks to the member drives.
> 
> NO! This is a "can't see the wood for the trees" situation.

You can shout NO all you want, and make inapplicable metaphors, but you
are still wrong.

> If one block
> in a raid-6 is corrupt, we can correct it. That's maths, that's what the
> maths says, and it is not only possible, but *definite*.

The math has preconditions.  If the preconditions are unmet, or unknown,
you cannot use the math.
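
To make the precondition concrete, here is a minimal Python sketch of the
single-error locator that the P/Q math provides. It is not md code. It
assumes one-byte "blocks", four data blocks, and the GF(2^8) field with
polynomial 0x11d and generator 2; all function names are made up for this
illustration. The first case is a genuine single corrupt block, which the
math recovers. The second is a torn stripe write where two data blocks are
newer than P and Q, with deltas deliberately chosen so that the syndromes
point at an untouched third block.

```python
# GF(2^8) log/antilog tables (polynomial 0x11d, generator 2).
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 510):
    EXP[i] = EXP[i - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide a by b (b must be non-zero)."""
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % 255]


def syndromes(data):
    """Return (P, Q) for a list of one-byte data blocks."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(EXP[i], d)
    return p, q


def locate_single_error(data, p, q):
    """Assume exactly one data block is wrong.

    Returns (index, "corrected" byte), or None when the syndromes do not
    describe a single bad data block.
    """
    p_now, q_now = syndromes(data)
    sp, sq = p ^ p_now, q ^ q_now
    if sp == 0 or sq == 0:
        return None  # no mismatch, or only P or only Q disagrees
    z = (LOG[sq] - LOG[sp]) % 255
    if z >= len(data):
        return None  # no data block fits the syndromes
    return z, data[z] ^ sp


original = [0x11, 0x22, 0x33, 0x44]
p, q = syndromes(original)

# Case 1: precondition holds, exactly one block is corrupt.
corrupt = list(original)
corrupt[2] ^= 0x5A
good_fix = locate_single_error(corrupt, p, q)

# Case 2: precondition violated. Blocks 0 and 1 were rewritten, but P and Q
# are still the old values. The deltas are picked so the locator blames
# block 2, which was never touched.
e1 = 0x01
e0 = gf_div(gf_mul(6, e1), 5)
torn = list(original)
torn[0] ^= e0
torn[1] ^= e1
bad_fix = locate_single_error(torn, p, q)

print("case 1:", good_fix)
print("case 2:", bad_fix, "block 2 on disk was", hex(torn[2]))
```

In case 2 the locator reports block 2 and proposes a new value for it,
although block 2 on disk is intact. The syndromes alone cannot tell the two
cases apart; that knowledge has to come from outside the math.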

> WHAT caused the corruption, and HOW, is irrelevant. The only requirement
> is that *just one block is lost*. If that's the case we can recover.

WHAT and HOW are the preconditions to the math.  The algorithm you seek
exists as a userspace utility that an administrator can use after
suitable analysis of the situation.  Feel free to script a call to that
utility on *your* system whenever your check scrub signals a mismatch.
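
A rough sketch of such a script, in Python, might look like the following.
It assumes the array exposes its scrub result in the md sysfs file
mismatch_cnt. The utility is not named in this message, so the repair
command is a placeholder argument the administrator supplies; the function
names and the injectable run parameter are likewise made up for this
illustration.

```python
from pathlib import Path
import subprocess


def mismatch_count(md_name, sysfs_root="/sys/block"):
    """Read the mismatch counter left behind by the last 'check' scrub."""
    path = Path(sysfs_root) / md_name / "md" / "mismatch_cnt"
    return int(path.read_text().strip())


def handle_scrub_result(md_name, repair_cmd, sysfs_root="/sys/block",
                        run=subprocess.run):
    """Run repair_cmd (an argv list chosen by the admin) only on mismatch.

    Returns None when the scrub was clean, otherwise whatever run() returns.
    """
    if mismatch_count(md_name, sysfs_root) == 0:
        return None
    return run(repair_cmd, check=False)
```

Keeping the command an explicit argument leaves the decision, and the
analysis that should precede it, with whoever administers the machine.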

> At the end of the day, as I see it, MD raid *can* do data integrity. So
> if the user thinks the performance hit is worth it, why not?

You are seeing a mirage due to a naive application of the math.

> MD raid *can* do data recovery. So why not?

It *cannot* do it for reasons many of us have tried to explain.  Sorry.

> And yes, given the opportunity I will write it myself. I just have to be
> honest and say my family situation interferes with that desire fairly
> drastically (which is why I've put a lot of effort in elsewhere, that
> doesn't require long stretches of concentration).

As I said to Nix, no system administrator who cares about their data
will touch a kernel that includes such a patch.

Phil