I don't mean to be rude, but it's been two weeks and my system is still
in this state. Bump, anyone?
A thorough search of the web (before I originally posted this to the
list) turned up nothing conclusive. There seems to be no explanation of
why this occurs, only that it has happened a number of times. Most
reports indicate that completely stopping and reassembling the array
fixes it, but I tried that and the disk still returned to spare. A few
reports describe the same situation as mine, but none of them received
an answer that seems complete.
Eventually the discussion turns to wiping the disks and starting again.
That seems a bit drastic, and I'm concerned that *one* of the disks is
faulty but not being reported as such, so I don't want to pick the
wrong one to wipe the superblock from. mdadm reports no errors, but
SMART indicates there may be a problem with the *active* disk, which is
even more worrying, because until the spare becomes active I can't
remove the active disk to test it properly.
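For reference, I assume an extended SMART self-test can be run against
the active disk in place, without taking it out of the array (the
device name below is just what it is on my system, and the output
obviously varies by drive):

$ sudo smartctl -a /dev/sdf           # full attribute and error-log dump
$ sudo smartctl -t long /dev/sdf      # start an extended offline self-test
$ sudo smartctl -l selftest /dev/sdf  # read the result once the test completes

That would at least let me exercise /dev/sdf without pulling it from
the degraded array.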
Any ideas?
Cheers,
Tudor.
On 03/12/12 11:04, Tudor Holton wrote:
Hello,
I'm having some trouble with an array of mine that has become degraded.
It currently shows this state:
md101 : active raid1 sdf1[0] sdb1[2](S)
      1953511936 blocks [2/1] [U_]
mdadm --detail says:
/dev/md101:
        Version : 0.90
  Creation Time : Thu Jan 13 14:34:27 2011
     Raid Level : raid1
     Array Size : 1953511936 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953511936 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 101
    Persistence : Superblock is persistent

    Update Time : Fri Nov 23 03:23:04 2012
          State : clean, degraded
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

           UUID : 43e92a79:90295495:0a76e71e:56c99031 (local to host barney)
         Events : 0.2127

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync   /dev/sdf1
       1       0        0        1      removed

       2       8       17        -      spare   /dev/sdb1
If I attempt to force the spare to become active, it begins to recover:
$ sudo mdadm -S /dev/md101
mdadm: stopped /dev/md101
$ sudo mdadm --assemble --force --no-degraded /dev/md101 /dev/sdf1 /dev/sdb1
mdadm: /dev/md101 has been started with 1 drive (out of 2) and 1 spare.
$ cat /proc/mdstat
md101 : active raid1 sdf1[0] sdb1[2]
      1953511936 blocks [2/1] [U_]
      [>....................]  recovery =  0.0% (541440/1953511936) finish=420.8min speed=77348K/sec
The recovery runs for the full estimated time, but the disk then
returns to being a spare.
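I assume the next thing worth checking is the superblock on each
partition, to see whether the event counts or device roles disagree
(I'm not sure what a healthy pair of 0.90 superblocks should look
like):

$ sudo mdadm --examine /dev/sdf1
$ sudo mdadm --examine /dev/sdb1

I can post that output if it would help.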
Neither disk partition reports errors:
$ cat /sys/block/md101/md/dev-sdf1/errors
0
$ cat /sys/block/md101/md/dev-sdb1/errors
0
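I assume some of the sysfs attributes are also relevant here, though
I'm only guessing at which ones matter and how to read them:

$ cat /sys/block/md101/md/array_state
$ cat /sys/block/md101/md/degraded
$ cat /sys/block/md101/md/sync_action
$ cat /sys/block/md101/md/dev-sdb1/state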
Are there any mdadm logs that would show why this is not recovering
properly? How else can I debug this?
Cheers,
Tudor.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html