List, good evening,
We run a 2TB fileserver in a raid1 configuration. Today one of the two
disks (/dev/sdb) failed; we have just replaced it and set up exactly the
same partitions on the new disk as the working, but degraded, raid has
on /dev/sda.
Using the commands
# mdadm --manage -a /dev/md0 /dev/sdb1
(and so on for md1 through md7)
is resulting in an unusually slow recovery: mdadm is now recovering the
largest partition (1.8TB) but expects to spend 5 days on it. I think I
must have done something wrong. May I ask a couple of questions?
1 Is there a safe command to stop the recovery/add process that is
ongoing? I reread man mdadm but did not see a command I could use for
this.
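(The closest things I could find were the sysfs/proc knobs rather than
an mdadm command - something like
# echo idle > /sys/block/md7/md/sync_action
# echo 1000 > /proc/sys/dev/raid/speed_limit_max
where the first is supposed to interrupt the current recovery and the
second merely throttles it - but I am not at all sure that either of
these is actually safe to use on a recovery that is underway, hence the
question.)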
2 After the failure of /dev/sdb, mdstat listed the sdbX member of each
md device with an '(F)'. We then also 'FAIL'ed each sdb partition in
each md device, and then powered down the machine to replace sdb.
After powering up and booting back into Debian, we created the
partitions on (the new) sdb to mirror those on /dev/sda. We then
issued these commands one after the other:
# mdadm --manage -a /dev/md0 /dev/sdb1
# mdadm --manage -a /dev/md1 /dev/sdb2
# mdadm --manage -a /dev/md2 /dev/sdb3
# mdadm --manage -a /dev/md3 /dev/sdb5
# mdadm --manage -a /dev/md4 /dev/sdb6
# mdadm --manage -a /dev/md5 /dev/sdb7
# mdadm --manage -a /dev/md6 /dev/sdb8
# mdadm --manage -a /dev/md7 /dev/sdb9
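(In case it is relevant: the earlier 'FAIL' step was done with the
ordinary mdadm form, along the lines of
# mdadm --manage /dev/md0 --fail /dev/sdb1
repeated for each md device, and the partitions on the new sdb were set
up to match sda with the equivalent of
# sfdisk -d /dev/sda | sfdisk /dev/sdb
though I may be misremembering the exact invocations.)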
Have I missed some vital step, and is that why the recovery process is
taking such a very long time?
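(I did also wonder whether the new disk itself might simply be slow, and
whether a quick read-speed check such as
# hdparm -t /dev/sda /dev/sdb
would be a sensible thing to try, but I did not want to poke at the
machine any further before asking here.)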
mdstat and lsdrv outputs here (UUIDs abbreviated):
# cat /proc/mdstat
Personalities : [raid1]
md7 : active raid1 sdb9[3] sda9[2]
1894416248 blocks super 1.2 [2/1] [U_]
[>....................] recovery = 0.0% (1493504/1894416248)
finish=7248.4min speed=4352K/sec
md6 : active raid1 sdb8[3] sda8[2]
39060408 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md5 : active raid1 sdb7[3] sda7[2]
975860 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md4 : active raid1 sdb6[3] sda6[2]
975860 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md3 : active raid1 sdb5[3] sda5[2]
4880372 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md2 : active raid1 sdb3[3] sda3[2]
9764792 blocks super 1.2 [2/1] [U_]
resync=DELAYED
md1 : active raid1 sdb2[3] sda2[2]
2928628 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdb1[3] sda1[2]
498676 blocks super 1.2 [2/2] [UU]
unused devices: <none>
I also meant to ask - why are the /dev/sdb partitions now shown with a
'[3]' (e.g. sdb9[3])? Previously I think they had a '[1]'.
# ./lsdrv
**Warning** The following utility(ies) failed to execute:
sginfo
pvs
lvs
Some information may be missing.
Controller platform [None]
└platform floppy.0
└fd0 4.00k [2:0] Empty/Unknown
PCI [sata_nv] 00:08.0 IDE interface: nVidia Corporation MCP61 SATA
Controller (rev a2)
├scsi 0:0:0:0 ATA WDC WD20EZRX-00D {WD-WC....R1}
│└sda 1.82t [8:0] Partitioned (dos)
│ ├sda1 487.00m [8:1] MD raid1 (0/2) (w/ sdb1) in_sync 'Server6:0'
{b307....e950}
│ │└md0 486.99m [9:0] MD v1.2 raid1 (2) clean {b307....e950}
│ │ │ ext2 {4ed1....e8b1}
│ │ └Mounted as /dev/md0 @ /boot
│ ├sda2 2.79g [8:2] MD raid1 (0/2) (w/ sdb2) in_sync 'Server6:1'
{77b1....50f2}
│ │└md1 2.79g [9:1] MD v1.2 raid1 (2) clean {77b1....50f2}
│ │ │ jfs {7d08....bae5}
│ │ └Mounted as /dev/disk/by-uuid/7d08....bae5 @ /
│ ├sda3 9.31g [8:3] MD raid1 (0/2) (w/ sdb3) in_sync 'Server6:2'
{afd6....b694}
│ │└md2 9.31g [9:2] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/18.62g) 0.00k/sec {afd6....b694}
│ │ │ jfs {81bb....92f8}
│ │ └Mounted as /dev/md2 @ /usr
│ ├sda4 1.00k [8:4] Partitioned (dos)
│ ├sda5 4.66g [8:5] MD raid1 (0/2) (w/ sdb5) in_sync 'Server6:3'
{d00a....4e99}
│ │└md3 4.65g [9:3] MD v1.2 raid1 (2) active DEGRADED, recover
(0.00k/9.31g) 0.00k/sec {d00a....4e99}
│ │ │ jfs {375b....4fd5}
│ │ └Mounted as /dev/md3 @ /var
│ ├sda6 953.00m [8:6] MD raid1 (0/2) (w/ sdb6) in_sync 'Server6:4'
{25af....d910}
│ │└md4 952.99m [9:4] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/1.86g) 0.00k/sec {25af....d910}
│ │ swap {d92f....2ad7}
│ ├sda7 953.00m [8:7] MD raid1 (0/2) (w/ sdb7) in_sync 'Server6:5'
{0034....971a}
│ │└md5 952.99m [9:5] MD v1.2 raid1 (2) active DEGRADED, recover
(0.00k/1.86g) 0.00k/sec {0034....971a}
│ │ │ jfs {4bf7....0fff}
│ │ └Mounted as /dev/md5 @ /tmp
│ ├sda8 37.25g [8:8] MD raid1 (0/2) (w/ sdb8) in_sync 'Server6:6'
{a5d9....568d}
│ │└md6 37.25g [9:6] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/74.50g) 0.00k/sec {a5d9....568d}
│ │ │ jfs {fdf0....6478}
│ │ └Mounted as /dev/md6 @ /home
│ └sda9 1.76t [8:9] MD raid1 (0/2) (w/ sdb9) in_sync 'Server6:7'
{9bb1....bbb4}
│ └md7 1.76t [9:7] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/3.53t) 3.01m/sec {9bb1....bbb4}
│ │ jfs {60bc....33fc}
│ └Mounted as /dev/md7 @ /srv
└scsi 1:0:0:0 ATA ST2000DL003-9VT1 {5Y....HT}
└sdb 1.82t [8:16] Partitioned (dos)
├sdb1 487.00m [8:17] MD raid1 (1/2) (w/ sda1) in_sync 'Server6:0'
{b307....e950}
│└md0 486.99m [9:0] MD v1.2 raid1 (2) clean {b307....e950}
│ ext2 {4ed1....e8b1}
├sdb2 2.79g [8:18] MD raid1 (1/2) (w/ sda2) in_sync 'Server6:1'
{77b1....50f2}
│└md1 2.79g [9:1] MD v1.2 raid1 (2) clean {77b1....50f2}
│ jfs {7d08....bae5}
├sdb3 9.31g [8:19] MD raid1 (1/2) (w/ sda3) spare 'Server6:2'
{afd6....b694}
│└md2 9.31g [9:2] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/18.62g) 0.00k/sec {afd6....b694}
│ jfs {81bb....92f8}
├sdb4 1.00k [8:20] Partitioned (dos)
├sdb5 4.66g [8:21] MD raid1 (1/2) (w/ sda5) spare 'Server6:3'
{d00a....4e99}
│└md3 4.65g [9:3] MD v1.2 raid1 (2) active DEGRADED, recover
(0.00k/9.31g) 0.00k/sec {d00a....4e99}
│ jfs {375b....4fd5}
├sdb6 953.00m [8:22] MD raid1 (1/2) (w/ sda6) spare 'Server6:4'
{25af....d910}
│└md4 952.99m [9:4] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/1.86g) 0.00k/sec {25af....d910}
│ swap {d92f....2ad7}
├sdb7 953.00m [8:23] MD raid1 (1/2) (w/ sda7) spare 'Server6:5'
{0034....971a}
│└md5 952.99m [9:5] MD v1.2 raid1 (2) active DEGRADED, recover
(0.00k/1.86g) 0.00k/sec {0034....971a}
│ jfs {4bf7....0fff}
├sdb8 37.25g [8:24] MD raid1 (1/2) (w/ sda8) spare 'Server6:6'
{a5d9....568d}
│└md6 37.25g [9:6] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/74.50g) 0.00k/sec {a5d9....568d}
│ jfs {fdf0....6478}
├sdb9 1.76t [8:25] MD raid1 (1/2) (w/ sda9) spare 'Server6:7'
{9bb1....bbb4}
│└md7 1.76t [9:7] MD v1.2 raid1 (2) clean DEGRADED, recover
(0.00k/3.53t) 3.01m/sec {9bb1....bbb4}
│ jfs {60bc....33fc}
└sdb10 1.00m [8:26] Empty/Unknown
PCI [pata_amd] 00:06.0 IDE interface: nVidia Corporation MCP61 IDE
(rev a2)
├scsi 2:0:0:0 AOPEN CD-RW CRW5224
{AOPEN_CD-RW_CRW5224_1.07_20020606_}
│└sr0 1.00g [11:0] Empty/Unknown
└scsi 3:x:x:x [Empty]
Other Block Devices
├loop0 0.00k [7:0] Empty/Unknown
├loop1 0.00k [7:1] Empty/Unknown
├loop2 0.00k [7:2] Empty/Unknown
├loop3 0.00k [7:3] Empty/Unknown
├loop4 0.00k [7:4] Empty/Unknown
├loop5 0.00k [7:5] Empty/Unknown
├loop6 0.00k [7:6] Empty/Unknown
└loop7 0.00k [7:7] Empty/Unknown
The OS is still as originally installed some years ago - Debian
6/Squeeze. It has been pretty solid; we have had to replace disks
before, but never with such a slow recovery.
I'd be very grateful for any thoughts.
regards, Ron