I have a 5 disk raid5 array that had a disk failure. I removed the disk,
added a new one (and a spare), and recovery began. Halfway through recovery,
a second disk failed.
However, while the first disk really was dead, the second failure seems to have
been transient: SMART data and disk testing suggest the disk is fine.
The question is: how can I tell mdadm to un-fail this second disk? From what
I've found in the archives, I think I need to use the --force option, but I'm
concerned about getting the device names in the wrong order (and totally
destroying my array in the process), so I thought I'd ask here first (a sketch
of what I was planning to run is below, after the mdstat output). Here is my
/proc/mdstat from when recovery initially began:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdc1[0](S) sdf1[5] sdb1[4] sda1[3] sde1[2] sdd1[1]
      976783616 blocks level 5, 32k chunk, algorithm 2 [5/4] [_UUUU]
      [>....................]  recovery =  0.0% (237952/244195904) finish=427.0min speed=9518K/sec

and here is my current mdstat:

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sdc1[5](S) sdf1[6](S) sdb1[4] sda1[3] sde1[7](F) sdd1[1]
      976783616 blocks level 5, 32k chunk, algorithm 2 [5/3] [_U_UU]

sde is the disk that is now marked as failed, and which I would like to put
back into service.
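In case it makes the question clearer, here is roughly what I was planning to
try, pieced together from the archives. The device list is just my reading of
the mdstat above, so please correct me if the approach (or the order) is wrong:

  # inspect the superblock on the "failed" disk before touching anything
  mdadm --examine /dev/sde1

  # stop the degraded array
  mdadm --stop /dev/md1

  # force-assemble from the member partitions; as I understand it, --force
  # tells mdadm to accept sde1 even though its event count is behind the others
  mdadm --assemble --force /dev/md1 /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sde1 /dev/sdc1 /dev/sdf1

Is that the right idea, and is it safe to list the spares (sdc1 and sdf1) on
the assemble line, or should they be added back afterwards with --add?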
Also, what does the number in square brackets after each device mean, and why
did that number change for sdc, sde, and sdf?
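If the role numbers recorded in the superblocks would help answer that, I can
post the output of something like:

  mdadm --detail /dev/md1
  mdadm --examine /dev/sde1   # (and likewise for the other members)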
Thanks, Frank