Theodore Tso wrote:
> On Thu, Mar 20, 2008 at 03:19:08PM +0100, Bas van Schaik wrote:
>>> There's no explicit message produced by the md module, no. You need to
>>> check the /sys/block/md{X}/md/mismatch_cnt entry to find out how many
>>> mismatches there are. Similarly, following a repair this will indicate
>>> how many mismatches it thinks have been fixed (by updating the parity
>>> block to match the data blocks).
>> Marvellous! I naively assumed that the module would warn me, but that's
>> not true. Wouldn't it be appropriate to print a message to dmesg if such
>> a mismatch occurs during a check? Such a mismatch clearly means that
>> there is something wrong with your hardware lying beneath md, doesn't it?
> If a mismatch is detected in a RAID-6 configuration, it should be
> possible to figure out what should be fixed (since with two hot spares
> there should be enough redundancy not only to detect an error, but to
> correct it.) Out of curiosity, does md do this automatically, either
> when reading from a stripe, or during a resync operation?
In my modest experience running root and high-performance spool filesystems on
various RAID levels, I can pretty much conclude that the current check mechanism
doesn't give the user enough power. We can debate all we want about what the MD
driver should do when it finds a mismatch, yet there is no way for the user to
figure out what the mismatch is and take appropriate action. This does not
apply only to RAID5/6 - what about RAID1/10 with >2 chunk copies? What if the
one wrong copy is taken and written over all the good ones?
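
To illustrate the >2 copies case: if a checksum of each copy were available,
a strict majority vote would single out the odd copy. This is only a sketch
with made-up names (odd_copies, the disk labels); md does not expose such
checksums today, and a majority is a strong hint rather than proof of which
copy is correct.

```python
from collections import Counter


def odd_copies(checksums):
    """checksums maps a disk name to the checksum of its copy of a chunk.

    Returns the sorted list of disks that disagree with the strict majority,
    or None when no checksum is held by more than half of the disks
    (for example a two-way mirror whose copies differ).
    """
    counts = Counter(checksums.values())
    majority_value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(checksums):
        return None  # no strict majority, cannot decide
    return sorted(disk for disk, value in checksums.items()
                  if value != majority_value)
```

With three copies where one differs, the differing disk is returned; with two
copies that differ, the result is None, which is exactly the situation where
a blind repair has to guess.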
I think that the solution is rather simple, and I would contribute a patch if
I had any C experience. The current check mechanism remains the same -
mismatch_cnt is incremented/reset just the same as before. However on every
mismatching chunk the system printks the following:
1) the start offset of the chunk (md1/10) or stripe (md5/6) within the MD device
2) one line for every active disk containing:
   a) the offset of the chunk within the MD component
   b) a {md5|sha1}sum of the chunk
For a typical array this will take no more than 8 lines in dmesg. However,
it will allow:
1) a human to determine at a glance which disk holds a mismatching chunk
in raid 1/10
2) the same to be determined for raid 6, using a userspace tool which
calculates the parity for every possible combination of chunks
3) external tools to determine which file on the layered file system might
have been affected
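
The raid 6 tool from point 2 could work roughly as sketched below. It assumes
the usual P/Q scheme (P is the XOR of the data chunks, Q is a Reed-Solomon
syndrome over GF(2^8) with polynomial 0x11d and generator 2), that the data
chunks are passed in stripe order, and that the chunks themselves have been
read from the components at the reported offsets, since checksums alone are
not enough for this step. All function names are made up for illustration.

```python
from functools import reduce


def gf_mul2(x):
    """Multiply a byte by 2 in GF(2^8) with polynomial 0x11d."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
    return x


def compute_p(chunks):
    """XOR equally sized chunks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def compute_q(data):
    """Q = sum of 2^i * D_i over GF(2^8), evaluated with Horner's rule."""
    q = [0] * len(data[0])
    for chunk in reversed(data):
        q = [gf_mul2(v) ^ b for v, b in zip(q, chunk)]
    return bytes(q)


def locate_bad_chunk(data, p, q):
    """Guess which single chunk of a stripe is inconsistent.

    Returns None if the stripe is consistent, "P" or "Q" if only that
    parity chunk is off, the index of the data chunk if exactly one data
    chunk explains the mismatch, or "unknown" otherwise.
    """
    p_ok = compute_p(data) == p
    q_ok = compute_q(data) == q
    if p_ok and q_ok:
        return None
    if q_ok:
        return "P"
    if p_ok:
        return "Q"
    # Try every data chunk in turn: rebuild it from P and the other data
    # chunks, then see whether the stored Q agrees with the result.
    for z in range(len(data)):
        others = data[:z] + data[z + 1:]
        rebuilt = compute_p(others + [p])
        if compute_q(data[:z] + [rebuilt] + data[z + 1:]) == q:
            return z
    return "unknown"
```

Trying each chunk in turn is the brute-force approach described above; it
costs one extra P and Q computation per data disk, which should be negligible
for a handful of mismatching stripes.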
Now of course the problem remains how to repair the array using the
information obtained above. I think the best way would be to extend the syntax
of repair itself, so that:
    echo repair > .../sync_action
would use the old heuristics, while
    echo repair <mdoffset> <component N> > .../sync_action
would update the chunk on drive N which corresponds to the chunk/stripe at
mdoffset within the MD device, using the information from the other drives,
and not the other way around as might happen with just a repair.
Just my 2c
Peter