Neil Brown wrote:
On Friday April 13, chris@xxxxxxx wrote:
Dear All,
I have an 8-drive raid-5 array running under 2.6.11. This morning it
bombed out, and when I brought it up again, two drives had incorrect
event counts:
sda1: 0.8258715
sdb1: 0.8258715
sdc1: 0.8258715
sdd1: 0.8258715
sde1: 0.8258715
sdf1: 0.8258715
sdg1: 0.8258708
sdh1: 0.8258716
sdg1 is out of date (expected), but sdh1 has received an extra event.
Any attempt to restart with mdadm --assemble --force results in an
un-startable array with an event count of 0.8258715.
Can anybody advise on the correct command to use to get it started again?
I'm assuming I'll need to use mdadm --create --assume-clean, but I'm
not sure which drives should be included/excluded when I do this.
A difference of 1 in event counts is not supposed to cause a problem.
Have you tried simply assembling the array without including sdg1?
e.g.
mdadm -A /dev/md0 /dev/sd[abcdefh]1
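To make the comparison of event counts concrete, here is a minimal sketch that takes the values quoted in the report and shows how far each device is from the majority. It assumes the "0.8258715" notation is a high word and a low word separated by a dot, which is how I understand older mdadm to print 0.90 superblock event counters; the helper name parse_events is made up for illustration.

```python
from collections import Counter

# Event counts exactly as quoted in the report above.
reported = {
    "sda1": "0.8258715", "sdb1": "0.8258715", "sdc1": "0.8258715",
    "sdd1": "0.8258715", "sde1": "0.8258715", "sdf1": "0.8258715",
    "sdg1": "0.8258708", "sdh1": "0.8258716",
}

def parse_events(text):
    """Assumption: 'hi.lo' are the two 32-bit halves of a 64-bit counter."""
    hi, lo = text.split(".")
    return (int(hi) << 32) | int(lo)

counts = {dev: parse_events(value) for dev, value in reported.items()}

# The value shared by most devices is taken as the reference.
majority, _ = Counter(counts.values()).most_common(1)[0]

# Offset of every device that disagrees with the majority.
deltas = {dev: c - majority for dev, c in counts.items() if c != majority}

print("majority event count:", majority)
for dev, delta in sorted(deltas.items()):
    print(f"{dev}: {delta:+d} events relative to the majority")
```

With the numbers from the report this gives sdg1 at 7 events behind and sdh1 at 1 event ahead, which is the difference of 1 referred to above.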
Further to this, I have tried upgrading the kernel to 2.6.17. I get the
same errors.
Don't know if it is any use, but here is the tail of an strace for an
assemble command for both the bad system and a similar good system:
STRACE FROM ASSEMBLE - BAD ARRAY:
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\1\0\0\0\0\0\0\0\371S\2621I\311"..., 4096) = 4096
close(4) = 0
stat64("/dev/sdi1", {st_mode=S_IFBLK|0640, st_rdev=makedev(8, 129), ...}) = 0
open("/dev/sdb1", O_RDONLY|O_EXCL) = 4
ioctl(4, BLKGETSIZE64, 0xbffdf150) = 0
ioctl(4, BLKFLSBUF, 0) = 0
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\1\0\0\0\0\0\0\0\371S\2621I\311"..., 4096) = 4096
close(4) = 0
ioctl(3, 0x40480923, 0xbffdf2c0) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x400c0930, 0) = -1 EIO (Input/output error)
write(2, "mdadm: failed to RUN_ARRAY /dev/"..., 56mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
) = 56
exit_group(1) = ?
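For readers who want to know what the numeric ioctl requests in the trace are, here is a small sketch that splits a request number into its fields. It assumes the usual Linux x86 encoding (number in bits 0-7, type in bits 8-15, argument size in bits 16-29, direction in bits 30-31). The names in the table are from my recollection of the kernel's md_u.h and should be treated as assumptions, although the mdadm error message right after the failing 0x400c0930 call does say RUN_ARRAY.

```python
def decode_ioctl(request):
    """Split an ioctl request number into fields (Linux x86 layout assumed)."""
    return {
        "nr": request & 0xFF,               # command number
        "type": (request >> 8) & 0xFF,      # driver "magic" / type byte
        "size": (request >> 16) & 0x3FFF,   # size of the argument struct
        "dir": (request >> 30) & 0x3,       # 1 = userspace writes to kernel
    }

# Assumed names, recalled from md_u.h; not taken from the trace itself.
MD_IOCTL_NAMES = {
    0x21: "ADD_NEW_DISK",
    0x23: "SET_ARRAY_INFO",
    0x30: "RUN_ARRAY",
}

for request in (0x40480923, 0x40140921, 0x400C0930):
    f = decode_ioctl(request)
    name = MD_IOCTL_NAMES.get(f["nr"], "unknown")
    print(f"{request:#010x}: type={f['type']} nr={f['nr']:#04x} "
          f"size={f['size']} dir={f['dir']} -> {name}")
```

All three requests decode to type 9 with the write direction, and argument sizes of 72, 20 and 12 bytes. If the 0x40140921 calls are indeed ADD_NEW_DISK, then the bad trace adds seven devices before the failing call, while the good trace adds eight.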
SAME COMMAND, GOOD ARRAY:
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\0\0\0\0\0\0\0\0\316\360\34;:"..., 4096) = 4096
close(4) = 0
stat64("/dev/sdh1", {st_mode=S_IFBLK|0640, st_rdev=makedev(8, 113), ...}) = 0
open("/dev/sda1", O_RDONLY|O_EXCL) = 4
ioctl(4, BLKGETSIZE64, 0xbfcae6d8) = 0
ioctl(4, BLKFLSBUF, 0) = 0
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\0\0\0\0\0\0\0\0\316\360\34;:"..., 4096) = 4096
close(4) = 0
ioctl(3, 0x40480923, 0xbfcae800) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x400c0930, 0) = 0
write(2, "mdadm: /dev/md0 has been started"..., 46mdadm: /dev/md0 has been started with 8 drives) = 46
write(2, ".\n", 2.
) = 2
exit_group(0) = ?
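The 4096-byte reads in both traces are presumably the md superblocks, and their first bytes can be decoded by hand. The sketch below unpacks the first five 32-bit words of each buffer. The field names and order are my recollection of the 0.90 superblock layout in the kernel's md_p.h, and the little-endian byte order is an assumption based on the traces looking like they come from 32-bit x86; neither is confirmed by the post.

```python
import struct

# First 20 bytes of the read() buffers, transcribed from the strace output.
# strace prints octal escapes, so \374 N + \251 is fc 4e 2b a9.
bad = b"\374N+\251\0\0\0\0Z\0\0\0\1\0\0\0\0\0\0\0"
good = b"\374N+\251\0\0\0\0Z\0\0\0\0\0\0\0\0\0\0\0"

# Assumed field order for a 0.90 superblock header.
FIELDS = ("md_magic", "major_version", "minor_version",
          "patch_version", "gvalid_words")

def decode_header(buf):
    """Unpack five little-endian 32-bit words into named fields."""
    return dict(zip(FIELDS, struct.unpack("<5I", buf[:20])))

for label, buf in (("bad", bad), ("good", good)):
    header = decode_header(buf)
    print(label, {k: (hex(v) if k == "md_magic" else v)
                  for k, v in header.items()})
```

Both buffers start with the same magic value 0xa92b4efc and the version pair 0 and 90. Within the visible bytes they differ in the fourth word (1 on the bad array, 0 on the good one) and in the bytes after the fifth word, which lie beyond what is decoded here. Since these are two different arrays, those differences may well be unrelated to the failure.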