Re: mismatch_cnt questions

Neil Brown <neilb@xxxxxxx> · Mon, 5 Mar 2007 09:30:32 +1100

On Monday March 5, eyal@xxxxxxxxxxxxxx wrote:
> Neil Brown wrote:
> > On Sunday March 4, pernegger@xxxxxxxxx wrote:
> >>I have a mismatch_cnt of 384 on a 2-way mirror.
> [trim]
> >>3) Is the "repair" sync action safe to use on the above kernel? Any
> >>other methods / additional steps for fixing this?
> > 
> > "repair" is safe, though it may not be effective.
> > "repair" for raid1 did not work until Jan 26th this year.
> > Before then it was identical in effect to 'check'.
> 
> How is "repair" safe but not effective? When it finds a mismatch, how does
> it know which part is correct and which should be fixed (which copy of
> raid1, or which block in raid5)?

It is not 'effective' in that before 26jan2007 it did not actually
copy the chosen data on to the other drives.  i.e. a 'repair' had the
same effect as a 'check', which is 'safe'.
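
For reference, a minimal sketch of how these actions can be driven from
user space. It assumes the md sysfs files sync_action and mismatch_cnt,
which live under /sys/block/mdX/md/. The helper names are made up for
illustration, and the directory is a parameter so the code can be tried
against a scratch directory first.

```python
from pathlib import Path

# Only the actions discussed in this thread, plus "idle" to stop one.
VALID_ACTIONS = {"check", "repair", "idle"}


def read_mismatch_cnt(md_dir):
    """Return the mismatch count reported in <md_dir>/mismatch_cnt."""
    return int((Path(md_dir) / "mismatch_cnt").read_text().strip())


def start_sync_action(md_dir, action):
    """Request 'check', 'repair' or 'idle' by writing <md_dir>/sync_action."""
    if action not in VALID_ACTIONS:
        raise ValueError("unsupported sync action: %r" % (action,))
    (Path(md_dir) / "sync_action").write_text(action + "\n")
```

On a real array this would be something like
start_sync_action("/sys/block/md0/md", "check") run as root, where md0 is
only an example device name. Read mismatch_cnt once the action has
finished.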

> 
> When a disk fails we know what to rewrite, but when we discover a mismatch
> we do not have this knowledge. It may corrupt the good copy of a raid1.

If a block differs between the different drives in a raid1, then no
copy is 'good'.  It is possible that one copy is the one you think you
want, but you probably wouldn't know by looking at it.
The worst situation is to have inconsistent data. If you read and get
one value, then later read and get another value, that is really bad.

For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy
and writing it over all other copies.
For raid5 we assume the data is correct and update the parity.
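
To make the two policies concrete, here is a rough Python sketch. It is
not the md code: the function names are invented for illustration, blocks
are modelled as equal-length byte strings, and "the first copy" stands in
for whatever arbitrary choice is made.

```python
from functools import reduce


def repair_raid1(copies):
    """Pick one copy arbitrarily (here: the first) and use it for all mirrors."""
    chosen = copies[0]
    return [chosen for _ in copies]


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_raid5(data_blocks, parity):
    """Assume the data blocks are correct; return (new_parity, was_mismatch)."""
    expected = xor_blocks(data_blocks)
    return expected, expected != parity
```

After either repair every later read returns the same value, which is the
property that matters here. Neither function can tell which copy was the
one you wanted.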

You might be able to imagine a failure scenario where this produces
the 'wrong' result, but I'm confident that in the majority of cases it
is as good as any other option.

If we had something like ZFS which tracks checksums for all blocks,
and could somehow get that information usefully into the md level,
then maybe we could do something better.

I suspect that it would be very rare for raid5 to detect a mismatch
during a 'check', and raid1 would only see them when a write was
aborted, such as swap can do, and filesystems might do occasionally
(e.g. truncate a file that was recently written to).

NeilBrown