Hi guys,

So recently I had a hard drive go down with some unusual behaviour I thought I'd report. Since this is a production machine and I can't really replicate it, I've tried to be as detailed about the situation as I can.

In a nutshell, after some odd errors (which I think originated in the SATA code) the array degraded. I let it rebuild to a hot spare, but upon reboot it started rebuilding again even though the spare checked out as OK. I let it rebuild again, but upon reboot it rebuilt again. This behaviour spanned 2.6.12-ck3s and 2.6.15.1 (after the first weird rebuild, I did a kernel upgrade thinking the bug may have been fixed). I ended up having to swap the hot spare into the old drive's position on the SATA controller (i.e. put sdo where sdj used to be). Everything was groovy from then on out.

This array is normally dormant (very light duty). I was doing some heavy I/O when the errors came up, but I think this may have been the result of a bug in the SATA stack, because after the kernel upgrade the drives were a *LOT* quieter. A kernel upgrade shouldn't really do that.

In any case, I have included the relevant dmesgs and a --examine and --detail for all drives, taken as soon as the second (2.6.15.1) weird rebuild started.

If you need any more info I'll do my best to provide it, but I thought I should at least report this.

Neil

P.S. I realise that in the files below, md3 is also degraded. sdg died after the first rebuild but before the reboot, due to me running my program again (md5s of all files on the array) and subsequently tripping the alleged SATA bug.

---
dmesg below; --detail and --examine files attached, along with an unhappy mdstat.

Random Info:
CPU: AMD Athlon(TM) XP 2500+
Memory: 512M
SATA cards: 3 x SATA 3114 (md3 and md4)
Drives: 6 x Maxtor 6Y200M0 (md3) and 6 x Maxtor 7L300S0 (md4)

In this dmesg, sdo should *not* have been booted out of the array.

md: autorun ...
md: considering sdo1 ...
md:  adding sdo1 ...
md:  adding sdn1 ...
md:  adding sdm1 ...
md:  adding sdl1 ...
md:  adding sdk1 ...
md:  adding sdj1 ...
md:  adding sdi1 ...
md: sdh1 has different UUID to sdo1
md: sdg1 has different UUID to sdo1
md: sdf1 has different UUID to sdo1
md: sde1 has different UUID to sdo1
md: sdd1 has different UUID to sdo1
md: sdc1 has different UUID to sdo1
md: sdb3 has different UUID to sdo1
md: sdb2 has different UUID to sdo1
md: sdb1 has different UUID to sdo1
md: sda3 has different UUID to sdo1
md: sda2 has different UUID to sdo1
md: sda1 has different UUID to sdo1
devfs_mk_dev: could not append to parent for md/4
md: created md4
md: bind<sdi1>
md: bind<sdj1>
md: bind<sdk1>
md: bind<sdl1>
md: bind<sdm1>
md: bind<sdn1>
md: export_rdev(sdo1)
md: running: <sdn1><sdm1><sdl1><sdk1><sdj1><sdi1>
md: kicking non-fresh sdj1 from array!
md: unbind<sdj1>
md: export_rdev(sdj1)
raid5: device sdn1 operational as raid disk 5
raid5: device sdm1 operational as raid disk 0
raid5: device sdl1 operational as raid disk 1
raid5: device sdk1 operational as raid disk 2
raid5: device sdi1 operational as raid disk 4
raid5: allocated 6290kB for md4
raid5: raid level 5 set md4 active with 5 out of 6 devices, algorithm 2
 --- rd:6 wd:5 fd:1
 disk 0, o:1, dev:sdm1
 disk 1, o:1, dev:sdl1
 disk 2, o:1, dev:sdk1
 disk 4, o:1, dev:sdi1
 disk 5, o:1, dev:sdn1

/dev/md0:
        Version : 00.90.01
  Creation Time : Sun Jul 3 01:07:32 2005
     Raid Level : raid1
     Array Size : 4883648 (4.66 GiB 5.00 GB)
    Device Size : 4883648 (4.66 GiB 5.00 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Jan 24 12:45:59 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : d7e4fbc0:fb23ed40:4cb15b40:45319463
         Events : 0.1120516

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1

/dev/md1:
        Version : 00.90.01
  Creation Time : Sun Jul 3 01:07:39 2005
     Raid Level : raid1
     Array Size : 1951808 (1.86 GiB 2.00 GB)
    Device Size : 1951808 (1.86 GiB 2.00 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jan 24 13:38:45 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : e2e9d49f:a777210d:015eeea0:c259ad00
         Events : 0.855

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2

/dev/md2:
        Version : 00.90.01
  Creation Time : Sun Jul 3 01:07:46 2005
     Raid Level : raid1
     Array Size : 73200064 (69.81 GiB 74.96 GB)
    Device Size : 73200064 (69.81 GiB 74.96 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jan 24 12:45:59 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 51ad8be1:d645a8b9:4e62dc45:468e23f5
         Events : 0.8172300

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3

/dev/md3:
        Version : 00.90.01
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
     Array Size : 995708160 (949.58 GiB 1019.61 GB)
    Device Size : 199141632 (189.92 GiB 203.92 GB)
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 256K

           UUID : 090b8145:76a7193c:060bb39c:1e874db7
         Events : 0.40600763

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       81        1      active sync   /dev/sdf1
       2       0        0        -      removed
       3       8       65        3      active sync   /dev/sde1
       4       8      113        4      active sync   /dev/sdh1
       5       8       49        5      active sync   /dev/sdd1

/dev/md4:
        Version : 00.90.01
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
     Array Size : 1465248000 (1397.37 GiB 1500.41 GB)
    Device Size : 293049600 (279.47 GiB 300.08 GB)
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 256K

           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
         Events : 0.3230512

    Number   Major   Minor   RaidDevice State
       0       8      193        0      active sync   /dev/sdm1
       1       8      177        1      active sync   /dev/sdl1
       2       8      161        2      active sync   /dev/sdk1
       3       0        0        -      removed
       4       8      129        4      active sync   /dev/sdi1
       5       8      209        5      active sync   /dev/sdn1

/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : d7e4fbc0:fb23ed40:4cb15b40:45319463
  Creation Time : Sun Jul 3 01:07:32 2005
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Tue Jan 24 12:45:44 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 95203db1 - correct
         Events : 0.1120512

      Number   Major   Minor   RaidDevice State
this     0       8        1        0      active sync   /dev/sda1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1

/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : d7e4fbc0:fb23ed40:4cb15b40:45319463
  Creation Time : Sun Jul 3 01:07:32 2005
     Raid Level : raid1
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Tue Jan 24 12:45:44 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 95203dc3 - correct
         Events : 0.1120512

      Number   Major   Minor   RaidDevice State
this     1       8       17        1      active sync   /dev/sdb1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1

/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 090b8145:76a7193c:060bb39c:1e874db7
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 2
  Spare Devices : 0
       Checksum : e28a3bbd - correct
         Events : 0.40600763

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     0       8       33        0      active sync   /dev/sdc1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8      113        4      active sync   /dev/sdh1
   5     5       8       49        5      active sync   /dev/sdd1

/dev/sdd1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 090b8145:76a7193c:060bb39c:1e874db7
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 2
  Spare Devices : 0
       Checksum : e28a3bd7 - correct
         Events : 0.40600763

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     5       8       49        5      active sync   /dev/sdd1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8      113        4      active sync   /dev/sdh1
   5     5       8       49        5      active sync   /dev/sdd1

/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 090b8145:76a7193c:060bb39c:1e874db7
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 2
  Spare Devices : 0
       Checksum : e28a3be3 - correct
         Events : 0.40600763

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8      113        4      active sync   /dev/sdh1
   5     5       8       49        5      active sync   /dev/sdd1

/dev/sdf1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 090b8145:76a7193c:060bb39c:1e874db7
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 2
  Spare Devices : 0
       Checksum : e28a3bef - correct
         Events : 0.40600763

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     1       8       81        1      active sync   /dev/sdf1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8      113        4      active sync   /dev/sdh1
   5     5       8       49        5      active sync   /dev/sdd1

/dev/sdg1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 090b8145:76a7193c:060bb39c:1e874db7
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3

    Update Time : Tue Jan 24 12:45:45 2006
          State : active
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
       Checksum : e01ea8cf - correct
         Events : 0.40600053

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     2       8       97        2      active sync   /dev/sdg1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       8       97        2      active sync   /dev/sdg1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8      113        4      active sync   /dev/sdh1
   5     5       8       49        5      active sync   /dev/sdd1

/dev/sdh1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 090b8145:76a7193c:060bb39c:1e874db7
  Creation Time : Sat Apr 24 13:28:57 2004
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 3

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 2
  Spare Devices : 0
       Checksum : e28a3c15 - correct
         Events : 0.40600763

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     4       8      113        4      active sync   /dev/sdh1

   0     0       8       33        0      active sync   /dev/sdc1
   1     1       8       81        1      active sync   /dev/sdf1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8      113        4      active sync   /dev/sdh1
   5     5       8       49        5      active sync   /dev/sdd1

/dev/sdi1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9b3c01e3 - correct
         Events : 0.3230512

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     4       8      129        4      active sync   /dev/sdi1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      225        3      active sync   /dev/sdo1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1

/dev/sdj1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 02:13:42 2006
          State : active
 Active Devices : 6
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 1
       Checksum : 9b0a08b2 - correct
         Events : 0.3226853

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     3       8      145        3      active sync   /dev/sdj1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      145        3      active sync   /dev/sdj1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1
   6     6       8      225        6      spare   /dev/sdo1

/dev/sdk1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9b3c01ff - correct
         Events : 0.3230512

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     2       8      161        2      active sync   /dev/sdk1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      225        3      active sync   /dev/sdo1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1

/dev/sdl1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9b3c020d - correct
         Events : 0.3230512

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     1       8      177        1      active sync   /dev/sdl1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      225        3      active sync   /dev/sdo1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1

/dev/sdm1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9b3c021b - correct
         Events : 0.3230512

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     0       8      193        0      active sync   /dev/sdm1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      225        3      active sync   /dev/sdo1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1

/dev/sdn1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9b3c0235 - correct
         Events : 0.3230512

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     5       8      209        5      active sync   /dev/sdn1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      225        3      active sync   /dev/sdo1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1

/dev/sdo1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
  Creation Time : Mon Jul 4 19:43:00 2005
     Raid Level : raid5
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 4

    Update Time : Tue Jan 24 13:35:21 2006
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9b3c0241 - correct
         Events : 0.3230512

         Layout : left-symmetric
     Chunk Size : 256K

      Number   Major   Minor   RaidDevice State
this     3       8      225        3      active sync   /dev/sdo1

   0     0       8      193        0      active sync   /dev/sdm1
   1     1       8      177        1      active sync   /dev/sdl1
   2     2       8      161        2      active sync   /dev/sdk1
   3     3       8      225        3      active sync   /dev/sdo1
   4     4       8      129        4      active sync   /dev/sdi1
   5     5       8      209        5      active sync   /dev/sdn1

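
For anyone who wants to cross-check the --examine output above without reading every block, here is a minimal Python sketch, not anything mdadm or the kernel actually runs. It pulls the Events counter and the claimed RaidDevice slot out of each device block, then lists members whose counter is behind the newest one and slots claimed by more than one device. The names parse_examine and check_members are hypothetical, the sample is trimmed to three md4 members and three fields each, and treating "Events : hi.lo" as a 64-bit counter split into two 32-bit halves is an assumption about the 0.90 superblock display.

```python
import re
from collections import defaultdict

# Trimmed copy of three md4 members from the --examine output above.
SAMPLE = """\
/dev/sdi1:
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
         Events : 0.3230512
this     4       8      129        4      active sync   /dev/sdi1
/dev/sdj1:
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
         Events : 0.3226853
this     3       8      145        3      active sync   /dev/sdj1
/dev/sdo1:
           UUID : 55c1c152:76de5d6a:5d5399ef:2fa00728
         Events : 0.3230512
this     3       8      225        3      active sync   /dev/sdo1
"""


def parse_examine(text):
    """Return {device: {"uuid", "events", "slot"}} from --examine style text."""
    devices = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"(/dev/\S+):", line)
        if header:
            current = devices.setdefault(header.group(1), {})
            continue
        if current is None:
            continue
        if line.startswith("UUID :"):
            current["uuid"] = line.split(":", 1)[1].strip()
        elif line.startswith("Events :"):
            # Assumption: "hi.lo" are the two 32-bit halves of one counter.
            hi, lo = line.split(":", 1)[1].strip().split(".")
            current["events"] = (int(hi) << 32) | int(lo)
        elif line.startswith("this "):
            # Columns: this, Number, Major, Minor, RaidDevice, State...
            current["slot"] = int(line.split()[4])
    return devices


def check_members(devices):
    """Return (stale device names, {slot: [devices claiming that slot]})."""
    newest = max(info["events"] for info in devices.values())
    stale = sorted(n for n, info in devices.items() if info["events"] < newest)
    by_slot = defaultdict(list)
    for name, info in sorted(devices.items()):
        by_slot[info["slot"]].append(name)
    clashes = {s: names for s, names in by_slot.items() if len(names) > 1}
    return stale, clashes


if __name__ == "__main__":
    stale, clashes = check_members(parse_examine(SAMPLE))
    print("behind the newest event count:", stale)
    print("slots claimed by more than one device:", clashes)
```

On the trimmed sample this reports sdj1 as the only member with an older event count, and shows sdj1 and sdo1 both claiming raid disk 3, while sdo1's counter matches sdi1's. Whether that slot clash is what causes sdo1 to be dropped at assembly is not something the sketch can establish.
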
Personalities : [raid1] [raid5] [raid6]
md1 : active raid1 sdb2[1] sda2[0]
      1951808 blocks [2/2] [UU]

md2 : active raid1 sdb3[1] sda3[0]
      73200064 blocks [2/2] [UU]

md3 : active (read-only) raid5 sdh1[4] sdf1[1] sde1[3] sdd1[5] sdc1[0]
      995708160 blocks level 5, 256k chunk, algorithm 2 [6/5] [UU_UUU]

md4 : active (read-only) raid5 sdn1[5] sdm1[0] sdl1[1] sdk1[2] sdi1[4]
      1465248000 blocks level 5, 256k chunk, algorithm 2 [6/5] [UUU_UU]

md0 : active raid1 sdb1[1] sda1[0]
      4883648 blocks [2/2] [UU]

unused devices: <none>