On 05/04/11 00:49, Roberto Spadim wrote:
I don't know, but this happened to me on an HP server with Linux
2.6.37. I changed to an older kernel release and the problem went away.
Check with Neil and the other md guys what the real problem is;
maybe the realtime module and other changes inside the kernel are the
problem, maybe not...
Just a quick solution idea: try an older kernel.
Quick precis:
- Started reshape from 512k to 64k chunk size.
- sdd got a bad sector and was kicked.
- Array froze all IO.
- Reboot required to get the system back.
- Restarted reshape with 9 drives.
- sdl suffered an IO error and was kicked.
- Array froze all IO.
- Reboot required to get the system back.
- Array will no longer mount with 8/10 drives.
- mdadm 3.1.5 segfaults when trying to start the reshape.
  Naively tried to run it under gdb to get a backtrace but was unable
  to stop it forking.
- Got the array started with mdadm 3.2.1.
- Attempted to re-add sdd/sdl (now marked as spares).
root@srv:~/mdadm-3.1.5# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sdl[1](S) sdd[6](S) sdc[0] sdh[9] sda[8] sde[7] sdg[5] sdb[4] sdf[3] sdm[2]
      7814078464 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/8] [U_UUUU_UUU]
        resync=DELAYED
md2 : active raid5 sdi[0] sdk[3] sdj[1]
      1465146368 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md6 : active raid1 sdo6[0] sdn6[1]
      821539904 blocks [2/2] [UU]
md5 : active raid1 sdo5[0] sdn5[1]
      104864192 blocks [2/2] [UU]
md4 : active raid1 sdo3[0] sdn3[1]
      20980800 blocks [2/2] [UU]
md3 : active (auto-read-only) raid1 sdo2[0] sdn2[1]
      8393856 blocks [2/2] [UU]
md1 : active raid1 sdo1[0] sdn1[1]
      20980736 blocks [2/2] [UU]
unused devices: <none>
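For anyone reading along, the md0 status field can be decoded mechanically. This is only a minimal sketch: `missing_slots` is a name made up for illustration, and it assumes the usual `[expected/active] [U_...]` layout of /proc/mdstat, where `_` marks a slot with no working member.

```python
import re


def missing_slots(status_line):
    """Return (expected, active, missing slot indices) from an mdstat status field.

    Hypothetical helper: assumes the "[n/m] [U_...]" layout seen in /proc/mdstat.
    """
    m = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", status_line)
    if m is None:
        raise ValueError("no '[n/m] [U_...]' status found")
    expected, active, flags = int(m.group(1)), int(m.group(2)), m.group(3)
    # Position in the flag string corresponds to the raid-disk slot number.
    missing = [slot for slot, flag in enumerate(flags) if flag == "_"]
    return expected, active, missing


if __name__ == "__main__":
    # Status field copied from the md0 entry above.
    print(missing_slots("[10/8] [U_UUUU_UUU]"))
```

For md0 this reports slots 1 and 6 as empty, which are the same two slots that `mdadm --detail` lists as removed.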
[ 303.640776] md: bind<sdl>
[ 303.677461] md: bind<sdm>
[ 303.837358] md: bind<sdf>
[ 303.846291] md: bind<sdb>
[ 303.851476] md: bind<sdg>
[ 303.860725] md: bind<sdd>
[ 303.861055] md: bind<sde>
[ 303.861982] md: bind<sda>
[ 303.862830] md: bind<sdh>
[ 303.863128] md: bind<sdc>
[ 303.863306] md: kicking non-fresh sdd from array!
[ 303.863353] md: unbind<sdd>
[ 303.900207] md: export_rdev(sdd)
[ 303.900260] md: kicking non-fresh sdl from array!
[ 303.900306] md: unbind<sdl>
[ 303.940100] md: export_rdev(sdl)
[ 303.942181] md/raid:md0: reshape will continue
[ 303.942242] md/raid:md0: device sdc operational as raid disk 0
[ 303.942285] md/raid:md0: device sdh operational as raid disk 9
[ 303.942327] md/raid:md0: device sda operational as raid disk 8
[ 303.942368] md/raid:md0: device sde operational as raid disk 7
[ 303.942409] md/raid:md0: device sdg operational as raid disk 5
[ 303.942449] md/raid:md0: device sdb operational as raid disk 4
[ 303.942490] md/raid:md0: device sdf operational as raid disk 3
[ 303.942531] md/raid:md0: device sdm operational as raid disk 2
[ 303.943733] md/raid:md0: allocated 10572kB
[ 303.943866] md/raid:md0: raid level 6 active with 8 out of 10 devices, algorithm 2
[ 303.943912] RAID conf printout:
[ 303.943916] --- level:6 rd:10 wd:8
[ 303.943920] disk 0, o:1, dev:sdc
[ 303.943924] disk 2, o:1, dev:sdm
[ 303.943927] disk 3, o:1, dev:sdf
[ 303.943931] disk 4, o:1, dev:sdb
[ 303.943934] disk 5, o:1, dev:sdg
[ 303.943938] disk 7, o:1, dev:sde
[ 303.943941] disk 8, o:1, dev:sda
[ 303.943945] disk 9, o:1, dev:sdh
[ 303.944061] md0: detected capacity change from 0 to 8001616347136
[ 303.944366] md: md0 switched to read-write mode.
[ 303.944427] md: reshape of RAID array md0
[ 303.944469] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 303.944511] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[ 303.944573] md: using 128k window, over a total of 976759808 blocks.
[ 304.054875] md0: unknown partition table
[ 304.393245] mdadm[5940]: segfault at 7f2000 ip 00000000004480d2 sp 00007fffa04777b8 error 4 in mdadm[400000+64000]
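Since I could not get a backtrace out of gdb, the segfault line itself is the only pointer to where mdadm 3.1.5 died. The sketch below pulls it apart. `decode_segfault` is a hypothetical helper, the regular expression assumes the x86 kernel message layout shown in the log, and the meaning of the error bits (bit 0 = page present, bit 1 = write, bit 2 = user mode) is the standard x86 page-fault error code.

```python
import re

# Assumed layout of the x86 kernel "segfault at ..." message (all numbers are hex).
SEGV_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault message into addresses and page-fault flags."""
    m = SEGV_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        "ip": ip,
        "offset": ip - base,                  # offset of the faulting instruction in the mapping
        "ip_in_mapping": base <= ip < base + size,
        "page_present": bool(err & 1),        # bit 0: fault on a present page
        "write": bool(err & 2),               # bit 1: write access (else read)
        "user_mode": bool(err & 4),           # bit 2: fault raised from user mode
    }


if __name__ == "__main__":
    line = ("mdadm[5940]: segfault at 7f2000 ip 00000000004480d2 sp "
            "00007fffa04777b8 error 4 in mdadm[400000+64000]")
    info = decode_segfault(line)
    print(hex(info["ip"]), hex(info["offset"]), info)
```

Read this way, error 4 is a user-mode read of an unmapped page at 0x7f2000, and the faulting instruction sits 0x480d2 bytes into the mdadm text mapping. If the binary was built with debug info, that address could presumably be resolved to a source line with a symbolizer such as addr2line, but I have not tried that here.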
root@srv:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Array Size : 7814078464 (7452.09 GiB 8001.62 GB)
Used Dev Size : 976759808 (931.51 GiB 1000.20 GB)
Raid Devices : 10
Total Devices : 10
Persistence : Superblock is persistent
Update Time : Tue Apr 5 07:54:30 2011
State : active, degraded
Active Devices : 8
Working Devices : 10
Failed Devices : 0
Spare Devices : 2
Layout : left-symmetric
Chunk Size : 512K
New Chunksize : 64K
Name : srv:server (local to host srv)
UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Events : 633835
    Number   Major   Minor   RaidDevice State
       0       8       32        0      active sync   /dev/sdc
       1       0        0        1      removed
       2       8      192        2      active sync   /dev/sdm
       3       8       80        3      active sync   /dev/sdf
       4       8       16        4      active sync   /dev/sdb
       5       8       96        5      active sync   /dev/sdg
       6       0        0        6      removed
       7       8       64        7      active sync   /dev/sde
       8       8        0        8      active sync   /dev/sda
       9       8      112        9      active sync   /dev/sdh
       1       8      176        -      spare   /dev/sdl
       6       8       48        -      spare   /dev/sdd
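As a sanity check that the size figures in the different outputs agree with each other, here is a small sketch. The function names are made up for illustration, and it assumes `--detail` reports sizes in KiB while `--examine` reports them in 512-byte sectors, which is what the GiB values printed next to them imply.

```python
def raid6_array_size_kib(used_dev_size_kib, raid_devices):
    """Usable RAID6 size: two devices' worth of space is taken by P and Q parity."""
    if raid_devices < 4:
        raise ValueError("RAID6 needs at least 4 devices")
    return used_dev_size_kib * (raid_devices - 2)


def kib_to_sectors(kib):
    """Convert KiB to 512-byte sectors."""
    return kib * 2


if __name__ == "__main__":
    used_dev_kib = 976759808  # "Used Dev Size" from mdadm --detail
    array_kib = raid6_array_size_kib(used_dev_kib, 10)
    print(array_kib)                   # compare with "Array Size" in --detail
    print(kib_to_sectors(array_kib))   # compare with "Array Size" in --examine
    print(array_kib * 1024)            # compare with the kernel's capacity in bytes
```

With the values from this array, 976759808 KiB x 8 data disks gives 7814078464 KiB, which matches the `--detail` Array Size, the 15628156928 sectors in `--examine`, and the 8001616347136 bytes in the kernel's "detected capacity change" message.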
root@srv:~# for i in /dev/sd? ; do mdadm --examine $i ; done
/dev/sda:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 9beb9a0f:2a73328c:f0c17909:89da70fd
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : c58ed095 - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 8
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 75d997f8:d9372d90:c068755b:81c8206b
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : 72321703 - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 4
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 5738a232:85f23a16:0c7a9454:d770199c
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : 5c61ea2e - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 0
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 83a2c731:ba2846d0:2ce97d83:de624339
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : e1a5ebbc - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : f1e3c1d3:ea9dc52e:a4e6b70e:e25a0321
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : 551997d7 - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 7
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : c32dff71:0b8c165c:9f589b0f:bcbc82da
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : db0aa39b - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 3
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdg:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 194bc75c:97d3f507:4915b73a:51a50172
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : 344cadbe - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 5
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdh:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 1326457e:4fc0a6be:0073ccae:398d5c7f
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : 8debbb14 - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 9
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdi:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : e39d73c3:75be3b52:44d195da:b240c146
Name : srv:2 (local to host srv)
Creation Time : Sat Jul 10 21:14:29 2010
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1465147120 (698.64 GiB 750.16 GB)
Array Size : 2930292736 (1397.27 GiB 1500.31 GB)
Used Dev Size : 1465146368 (698.64 GiB 750.15 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : b577b308:56f2e4c9:c78175f4:cf10c77f
Update Time : Tue Apr 5 07:46:18 2011
Checksum : 57ee683f - correct
Events : 455775
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAA ('A' == active, '.' == missing)
/dev/sdj:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : e39d73c3:75be3b52:44d195da:b240c146
Name : srv:2 (local to host srv)
Creation Time : Sat Jul 10 21:14:29 2010
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1465147120 (698.64 GiB 750.16 GB)
Array Size : 2930292736 (1397.27 GiB 1500.31 GB)
Used Dev Size : 1465146368 (698.64 GiB 750.15 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : b127f002:a4aa8800:735ef8d7:6018564e
Update Time : Tue Apr 5 07:46:18 2011
Checksum : 3ae0b4c6 - correct
Events : 455775
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : AAA ('A' == active, '.' == missing)
/dev/sdk:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : e39d73c3:75be3b52:44d195da:b240c146
Name : srv:2 (local to host srv)
Creation Time : Sat Jul 10 21:14:29 2010
Raid Level : raid5
Raid Devices : 3
Avail Dev Size : 1465147120 (698.64 GiB 750.16 GB)
Array Size : 2930292736 (1397.27 GiB 1500.31 GB)
Used Dev Size : 1465146368 (698.64 GiB 750.15 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 90fddf63:03d5dba4:3fcdc476:9ce3c44c
Update Time : Tue Apr 5 07:46:18 2011
Checksum : dd5eef0e - correct
Events : 455775
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : AAA ('A' == active, '.' == missing)
/dev/sdl:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 769940af:66733069:37cea27d:7fb28a23
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : dc756202 - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : spare
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
/dev/sdm:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x4
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Name : srv:server (local to host srv)
Creation Time : Sat Jan 8 11:25:17 2011
Raid Level : raid6
Raid Devices : 10
Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
Array Size : 15628156928 (7452.09 GiB 8001.62 GB)
Used Dev Size : 1953519616 (931.51 GiB 1000.20 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : active
Device UUID : 7e564e2c:7f21125b:c3b1907a:b640178f
Reshape pos'n : 3437035520 (3277.81 GiB 3519.52 GB)
New Chunksize : 64K
Update Time : Tue Apr 5 07:54:30 2011
Checksum : b3df3ee7 - correct
Events : 633835
Layout : left-symmetric
Chunk Size : 512K
Device Role : Active device 2
Array State : A.AAAA.AAA ('A' == active, '.' == missing)
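That is a lot of output to compare by eye, so here is a rough sketch that boils it down to one row per member. `parse_examine` and `summarize` are hypothetical helpers, and the parser assumes the "Key : value" layout shown above, with each device introduced by a "/dev/...:" line.

```python
def parse_examine(text):
    """Parse concatenated `mdadm --examine` output into {device: {field: value}}."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("/dev/") and line.endswith(":"):
            current = devices.setdefault(line[:-1], {})
        elif current is not None and " : " in line:
            # Split on the first " : " only; values such as times contain bare colons.
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def summarize(devices, array_uuid):
    """Return sorted (device, events, role) rows for members of one array."""
    rows = []
    for name, fields in sorted(devices.items()):
        if fields.get("Array UUID") != array_uuid:
            continue
        rows.append((name, int(fields["Events"]), fields["Device Role"]))
    return rows


if __name__ == "__main__":
    # Abbreviated sample taken from the output above.
    sample = """\
/dev/sda:
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Events : 633835
Device Role : Active device 8
/dev/sdd:
Array UUID : d00a11d7:fe0435af:07c8d4d6:e3b8e34e
Events : 633835
Device Role : spare
"""
    for row in summarize(parse_examine(sample), "d00a11d7:fe0435af:07c8d4d6:e3b8e34e"):
        print(row)
```

Run over the full output, this shows all ten md0 members at event count 633835, with sdd and sdl reporting the role "spare" rather than their old slots 6 and 1.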
root@srv:~/mdadm-3.1.5# ./mdadm --version
mdadm - v3.1.5 - 23rd March 2011
root@srv:~/mdadm-3.1.5# uname -a
Linux srv 2.6.38 #19 SMP Wed Mar 23 09:57:05 WST 2011 x86_64 GNU/Linux
Now, the array restarted with mdadm 3.2.1, but of course it's now
reshaping on 8 out of 10 disks, has no redundancy, and is running at
600k/s, which will take over 10 days. Is there anything I can do to give
it some redundancy while it completes, or am I better off copying the
data off, blowing it away and starting again? All the important stuff is
backed up anyway; I just wanted to avoid restoring 8TB from backup if I
could.
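The "over 10 days" figure can be reproduced from the numbers above. This is a back-of-the-envelope sketch only. `reshape_eta_days` is a made-up name, and it assumes that the reshape position is in KiB of array address space, that the reshape runs from the start of the array to the end, and that the reported speed is per member device rather than for the array as a whole.

```python
def reshape_eta_days(array_size_kib, reshape_pos_kib, data_disks, speed_kib_per_s):
    """Estimate remaining reshape time in days.

    Assumptions (not confirmed by the md documentation here): position and size
    are in KiB of array space, and speed is measured per member device.
    """
    remaining_array_kib = array_size_kib - reshape_pos_kib
    # Each data disk holds 1/data_disks of the remaining array address space.
    remaining_per_device_kib = remaining_array_kib / data_disks
    return remaining_per_device_kib / speed_kib_per_s / 86400


if __name__ == "__main__":
    days = reshape_eta_days(
        array_size_kib=7814078464,   # Array Size from mdadm --detail
        reshape_pos_kib=3437035520,  # Reshape pos'n from mdadm --examine
        data_disks=8,                # 10-device RAID6 minus 2 parity
        speed_kib_per_s=600,         # observed reshape speed
    )
    print(f"{days:.1f} days")
```

Under those assumptions the remaining 4377042944 KiB works out to roughly 10.5 days at 600k/s.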
Regards,
Brad