On Thu, May 1, 2008 at 2:19 PM, George Spelvin <linux@xxxxxxxxxxx> wrote:
[..]
> But let me just ask... the RAID-5 repair code is known to work, right?
> So the situation I've got above points to some lower-level problem?
> It's not just somehow forgetting to write out the corrections and
> I'm seeing the same mismatches over and over again?
>
> Any other debugging suggestions?

I can reproduce this here, and until I can track down what happened, the
fix is reverting commit bd2ab67030e9116f1e4aae1289220255412b37fd
"md: close a livelock window in handle_parity_checks5".

That fix was tested to close a livelock condition for which I had a
reproducible test case, but I did not regression test
'echo repair > sync_action'. My fear was that this compromised re-adding
a dirty disk to a degraded array, but that appears unaffected:

$ mdadm --create /dev/md0 /dev/loop[0-3] -n 4 -l5
mdadm: array /dev/md0 started.
$ dd if=/dev/zero of=/dev/md0  # initialize with a known pattern
dd: writing to `/dev/md0': No space left on device
153217+0 records in
153216+0 records out
78446592 bytes (78 MB) copied, 1.64838 s, 47.6 MB/s
$ md5sum /dev/md0  # we should get this same checksum later on
82eb2aa05c6736d9215c430aa31f7cf3  /dev/md0
$ mdadm --fail /dev/md0 /dev/loop0
mdadm: set /dev/loop0 faulty in /dev/md0
$ mdadm --remove /dev/md0 /dev/loop0
mdadm: hot removed /dev/loop0
$ dd if=/data_dir/datafile of=/dev/loop0 oflag=sync  # dirty the failed disk
dd: writing to `/dev/loop0': No space left on device
51201+0 records in
51200+0 records out
26214400 bytes (26 MB) copied, 2.89976 s, 9.0 MB/s
$ mdadm --add /dev/md0 /dev/loop0
mdadm: added /dev/loop0
$ echo 1 > /proc/sys/vm/drop_caches
$ md5sum /dev/md0
82eb2aa05c6736d9215c430aa31f7cf3  /dev/md0  # recovery successful

--
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html