Re: mismatch_cnt questions

H. Peter Anvin wrote:
Eyal Lebedinsky wrote:
Neil Brown wrote:
[trim Q re how resync fixes data]
For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy
and writing it over all other copies.
For raid5 we assume the data is correct and update the parity.

Can raid6 identify the bad block (two parity blocks could allow this
if only one block has bad data in a stripe)? If so, does it?

This will surely mean more value for raid6 than just the two-disk-failure
protection.


No.  It's not mathematically possible.


Okay, I've thought about it, and I got it wrong the first time (off-the-cuff misapplication of the pigeonhole principle.)

It apparently *is* possible (for notation and algebra rules, see my paper):

Let's assume we know exactly one of the data (Dn) drives is corrupt (ignoring the case of P or Q corruption for now). That means that instead of Dn we have a corrupt value, Xn. Note that which data drive is corrupt (n) is not known.

We compute P' and Q' over the data as read (the corrupt set); P and Q are the values stored on the P and Q drives.

P+P' = Dn+Xn
Q+Q' = g^n Dn + g^n Xn		g = {02}

Q+Q' = g^n (Dn+Xn)

By assumption, Dn != Xn, so P+P' = Dn+Xn != {00}.
g^n is *never* {00}, so Q+Q' = g^n (Dn+Xn) != {00}.

(Q+Q')/(P+P') = [g^n (Dn+Xn)]/(Dn+Xn) = g^n

Since n is known to be in the range [0,255), we thus have:

n = log_g((Q+Q')/(P+P'))

... which is a well-defined relation.
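
A minimal sketch of the relation above in Python, for one byte position. It assumes the field is GF(2^8) with the polynomial x^8+x^4+x^3+x^2+1 (0x11d) and g = {02}; the function names are made up for illustration and this is not the md driver's code.

```python
# Build exp/log tables for GF(2^8). ASSUMPTION: polynomial 0x11d, g = {02}.
EXP = [0] * 255
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x          # EXP[i] = g^i
    LOG[x] = i          # LOG[g^i] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d

def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]

def gf_div(a, b):
    """Divide a by b; b must be nonzero."""
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % 255]

def syndromes(data):
    """Return (P, Q) for one byte from each data drive (fewer than 255 drives)."""
    p = q = 0
    for n, d in enumerate(data):
        p ^= d                      # addition in GF(2^8) is XOR
        q ^= gf_mul(EXP[n], d)      # g^n * Dn
    return p, q

def locate_corrupt_data_drive(data_as_read, p, q):
    """n = log_g((Q+Q')/(P+P')), assuming exactly one data byte is corrupt."""
    p2, q2 = syndromes(data_as_read)
    dp, dq = p ^ p2, q ^ q2
    # Under the single-corrupt-data-drive assumption both are nonzero.
    return LOG[gf_div(dq, dp)]

good = [0x11, 0x22, 0x33, 0x44, 0x55]
p, q = syndromes(good)
bad = list(good)
bad[3] ^= 0x5A                      # silently corrupt drive 3
print(locate_corrupt_data_drive(bad, p, q))
```

Once n is known, the original byte follows from the first equation: Dn = Xn + (P+P').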

For the case where either the P or the Q drives are corrupt (and the data drives are all good), this is easily detected by the fact that if P is the corrupt drive, Q+Q' = {00}; similarly, if Q is the corrupt drive, P+P' = {00}. Obviously, if P+P' = Q+Q' = {00}, then as far as RAID-6 can discover, there is no corruption in the drive set.
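
The case analysis can be written down as a small decision function on the two differences P+P' and Q+Q' (called dp and dq here). Again this is only a sketch under the same field assumption (polynomial 0x11d, g = {02}); the names are hypothetical. The final "inconsistent" branch is my addition: if the computed n is not a valid data drive index, the single-corruption assumption cannot hold.

```python
def build_log_table(poly=0x11D):
    """LOG[g^i] = i for g = {02}. ASSUMPTION: field polynomial 0x11d."""
    log = {}
    x = 1
    for i in range(255):
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= poly
    return log

LOG = build_log_table()

def classify(dp, dq, ndata):
    """Classify one byte position from dp = P+P' and dq = Q+Q'.

    Returns "clean", "P", "Q", a data drive index, or "inconsistent".
    """
    if dp == 0 and dq == 0:
        return "clean"            # nothing RAID-6 can discover
    if dq == 0:
        return "P"                # only P disagrees: P drive is corrupt
    if dp == 0:
        return "Q"                # only Q disagrees: Q drive is corrupt
    n = (LOG[dq] - LOG[dp]) % 255 # n = log_g(dq / dp)
    if n < ndata:
        return n
    return "inconsistent"         # added check, not part of the argument above

print(classify(0x00, 0x00, 4))
print(classify(0x5A, 0x00, 4))
print(classify(0x00, 0x5A, 4))
print(classify(0x01, 0x04, 4))    # dq = g^2 * dp
```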

So, yes, RAID-6 *can* detect single drive corruption, and even tell you which drive it is, if you're willing to compute a full syndrome set (P', Q') on every read (as well as on every write).

Note: RAID-6 cannot detect 2-drive corruption, unless of course the corruption is in different byte positions. If multiple corresponding byte positions are corrupt, then the algorithm above will generally point you to a completely innocent drive.
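
To illustrate the last point, here is a small sketch that feeds the locator the XOR errors on each drive directly. It assumes the same field as before (polynomial 0x11d, g = {02}); blamed_drive is a hypothetical helper, not anything in md.

```python
# exp/log tables for GF(2^8). ASSUMPTION: polynomial 0x11d, g = {02}.
EXP, LOG = [0] * 255, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]

def blamed_drive(errors):
    """errors maps data drive index -> XOR error (Dn+Xn) at one byte position.

    Returns the index the single-corruption formula would blame,
    or None if P+P' or Q+Q' happens to be {00}.
    """
    dp = dq = 0
    for n, e in errors.items():
        dp ^= e                     # P+P' is the sum of all errors
        dq ^= gf_mul(EXP[n], e)     # Q+Q' is the sum of g^n * error
    if dp == 0 or dq == 0:
        return None
    return (LOG[dq] - LOG[dp]) % 255

print(blamed_drive({3: 0x5A}))          # one bad drive: found correctly
print(blamed_drive({0: 0x01, 1: 0x02})) # two bad drives, same byte position
```

With drives 0 and 1 both corrupt at the same byte position, the formula blames neither of them. Whether the blamed index even refers to an existing drive depends on how many data drives the array has.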

	-hpa