For various reasons, the email notifications on my RAID6 array weren't working, and 2 of the 15 drives failed out. I noticed this last week as I was about to move the server into a new case. As part of the move, I upgraded the OS to the latest CentOS, as I was having issues with the existing install and the new HBA card (a SASLP-MV8).

When the server came back up, for some reason it decided to fire up the md array with only 1 drive, and it incremented the Event count on that 1 drive (and since I'm already running with 2 failed drives on a RAID6, I can't just kick the drive out and let it rebuild).

The array shows this:

mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sat Jun 5 10:38:11 2010
     Raid Level : raid6
  Used Dev Size : 488383488 (465.76 GiB 500.10 GB)
   Raid Devices : 15
  Total Devices : 12
    Persistence : Superblock is persistent

    Update Time : Mon Apr 9 13:05:31 2012
          State : active, FAILED, Not Started
 Active Devices : 12
Working Devices : 12
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : file00bert.woodlea.org.uk:0  (local to host file00bert.woodlea.org.uk)
           UUID : 1470c671:4236b155:67287625:899db153
         Events : 1378022

    Number   Major   Minor   RaidDevice State
       0       8      113        0      active sync   /dev/sdh1
       1       8      209        1      active sync   /dev/sdn1
       2       8      225        2      active sync   /dev/sdo1
      15       8       17        3      active sync   /dev/sdb1
       4       8      145        4      active sync   /dev/sdj1
       5       8      161        5      active sync   /dev/sdk1
       6       0        0        6      removed
       7       8       81        7      active sync   /dev/sdf1
       8       8       97        8      active sync   /dev/sdg1
      16       8       65        9      active sync   /dev/sde1
      10       8       33       10      active sync   /dev/sdc1
      11       0        0       11      removed
      12       8      177       12      active sync   /dev/sdl1
      13       8      241       13      active sync   /dev/sdp1
      14       0        0       14      removed

Looking at the current Event count on all the drives, they show this:

sda1  1378024
sdb1  1378022
sdc1  1378022
sdd1  1362956
sde1  1378022
sdf1  1378022
sdg1  1378022
sdh1  1378022
sdj1  1378022
sdk1  1378022
sdl1  1378022
sdm1   616796
sdn1  1378022
sdo1  1378022
sdp1  1378022

So /dev/sdd1 and /dev/sdm1 are the 2 failed drives. The Event counts on all the other drives agree with each other and with the array, except for /dev/sda1, which is a couple of events higher than everything else - and because of that I can't start the array.

Since I know I did nothing with the temporary one-drive array while the server was booted (and I don't think the md code did anything either??), would it be safe to run

mdadm --assemble /dev/md0 /dev/sd[a-c]1 /dev/sd[e-h]1 /dev/sd[j-l]1 /dev/sd[n-p]1 --force

to let the array come back up and get it running?

What would then be the correct sequence to replace the 2 failed drives (sdd1 and sdm1) and get the array running fully again?

Thanks for your help.

YP.
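
P.S. For reference, the per-drive Event counts above were pulled from the member superblocks with something like the loop below (I'm reconstructing it from memory, so treat it as a sketch; the globs just cover the member partitions listed above):

for d in /dev/sd[a-h]1 /dev/sd[j-p]1; do
    echo -n "$d  "
    # read the md superblock on each member and print its Events line
    mdadm --examine "$d" | grep Events
done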
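
P.P.S. For the replacement step, my rough plan (once the --force assemble works and I've checked the data) would be something like the commands below - does that look right? /dev/sdX and /dev/sdY are just placeholders for whatever names the two new disks come up as, and I'm assuming I can copy the partition layout from one of the surviving members (sdb here) with sfdisk:

# copy the partition table from a surviving member onto each new disk
sfdisk -d /dev/sdb | sfdisk /dev/sdX
sfdisk -d /dev/sdb | sfdisk /dev/sdY

# add the new partitions back into the array and let it rebuild
mdadm --manage /dev/md0 --add /dev/sdX1
mdadm --manage /dev/md0 --add /dev/sdY1

# watch the recovery progress
cat /proc/mdstat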