On Tue, May 9, 2017 at 5:58 AM, David Brown <david.brown@xxxxxxxxxxxx> wrote:
> I thought you said that you had read Neil's article. Please go back and
> read it again. If you don't agree with what is written there, then
> there is little more I can say to convince you.
>
> One thing I can try, is to note that you are /not/ the first person to
> think "Surely with RAID-6 we can correct mismatches - it should be
> easy?".

H. Peter Anvin's RAID-6 paper, section 4, is what's apparently under
discussion:
http://milbret.anydns.info/pub/linux/kernel/people/hpa/raid6.pdf

This is totally non-trivial, especially because the paper says RAID-6
cannot detect or correct more than one corruption per stripe, and
ensuring that additional corruption isn't introduced in that rare case
is even harder.

I do think it's sane for raid6 repair to drop the current assumption
that the data strips are correct, by doing the evaluation in equation
27 (a rough sketch of that check is at the end of this message). If
there's no corruption, do nothing; if P or Q is corrupt, rewrite it;
if data is corrupt, report but do not repair, as follows:

1. md reports all data drives and the LBAs for the affected stripe.
   (It gets harder if md has to figure out which drive is actually
   affected, but that isn't required; it's just a matter of narrowing
   down what's really affected more efficiently.)
2. The file system needs to be able to accept the error from md.
3. The file system reports what was negatively impacted: file system
   metadata or data, and if data, the full file path.

And now suddenly this work is likewise non-trivial.

And there are already file systems that do exactly this: ZFS and
Btrfs. Both can unambiguously and efficiently determine whether data
is corrupt even when a drive doesn't report a read error.

--
Chris Murphy
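
For the curious, here is a minimal sketch of what the section 4 /
equation 27 check amounts to for a single byte column, assuming
GF(2^8) with generator g = 0x02 (the field Linux md's raid6 code
uses). The names gf_mul and raid6_classify are made up for
illustration; this is not the kernel's actual API.

#include <stdint.h>

/* Multiply in GF(2^8) with the 0x11d polynomial used by Linux raid6. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;

    while (b) {
        if (b & 1)
            p ^= a;
        b >>= 1;
        a = (a << 1) ^ ((a & 0x80) ? 0x1d : 0);
    }
    return p;
}

enum raid6_verdict { R6_OK, R6_P_BAD, R6_Q_BAD, R6_DATA_BAD, R6_UNKNOWN };

/*
 * d[0..n-1] are the data bytes read from the n data strips,
 * p and q are the parity bytes read from the P and Q strips.
 * On R6_DATA_BAD, *bad is the index of the single suspect data strip.
 */
static enum raid6_verdict raid6_classify(int n, const uint8_t *d,
                                         uint8_t p, uint8_t q, int *bad)
{
    uint8_t pp = 0, qq = 0, gz = 1;
    int z;

    /* Recompute the syndromes P' and Q' from the data actually read. */
    for (z = 0; z < n; z++) {
        pp ^= d[z];
        qq ^= gf_mul(gz, d[z]);     /* g^z * D_z */
        gz = gf_mul(gz, 2);
    }

    if (p == pp && q == qq)
        return R6_OK;               /* nothing to do */
    if (p != pp && q == qq)
        return R6_P_BAD;            /* only P needs rewriting */
    if (p == pp && q != qq)
        return R6_Q_BAD;            /* only Q needs rewriting */

    /*
     * Both syndromes differ.  If exactly one data strip z is wrong,
     * then Q ^ Q' == g^z * (P ^ P').  Look for such a z; if none
     * exists, more than one device is corrupt.
     */
    gz = 1;
    for (z = 0; z < n; z++) {
        if (gf_mul(gz, p ^ pp) == (q ^ qq)) {
            *bad = z;
            return R6_DATA_BAD;     /* report, don't silently rewrite */
        }
        gz = gf_mul(gz, 2);
    }
    return R6_UNKNOWN;
}

The last loop is where the non-triviality shows up: if no z in range
satisfies g^z * (P ^ P') == Q ^ Q', then more than one device is wrong
and there is no safe repair; all md could honestly do is report the
affected stripe.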