I've just had a 3rd drive fail on one of my RAID 6 arrays, and I'm looking for some advice on how to get it back up far enough that I can recover the data, and then on replacing the other failed drives.

mdadm -V
mdadm - v3.0.3 - 22nd October 2009

Not the most up-to-date release, but it seems to be the latest one available for FC12.

The /etc/mdadm.conf file is:

ARRAY /dev/md0 uuid=1470c671:4236b155:67287625:899db153

which explains why I didn't get emailed about the drive failures. This isn't my standard file, and I don't know how it was changed, but that's another issue for another day.

mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Sat Jun  5 10:38:11 2010
     Raid Level : raid6
  Used Dev Size : 488383488 (465.76 GiB 500.10 GB)
   Raid Devices : 15
  Total Devices : 12
    Persistence : Superblock is persistent

    Update Time : Tue Mar  1 22:17:41 2011
          State : active, degraded, Not Started
 Active Devices : 12
Working Devices : 12
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : file00bert.woodlea.org.uk:0  (local to host file00bert.woodlea.org.uk)
           UUID : 1470c671:4236b155:67287625:899db153
         Events : 254890

    Number   Major   Minor   RaidDevice State
       0       8      113        0      active sync   /dev/sdh1
       1       8       17        1      active sync   /dev/sdb1
       2       8      177        2      active sync   /dev/sdl1
       3       0        0        3      removed
       4       8       33        4      active sync   /dev/sdc1
       5       8      193        5      active sync   /dev/sdm1
       6       0        0        6      removed
       7       8       49        7      active sync   /dev/sdd1
       8       8      209        8      active sync   /dev/sdn1
       9       8      161        9      active sync   /dev/sdk1
      10       0        0       10      removed
      11       8      225       11      active sync   /dev/sdo1
      12       8       81       12      active sync   /dev/sdf1
      13       8      241       13      active sync   /dev/sdp1
      14       8        1       14      active sync   /dev/sda1

The output from the failed drives is as follows.

mdadm --examine /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 1470c671:4236b155:67287625:899db153
           Name : file00bert.woodlea.org.uk:0  (local to host file00bert.woodlea.org.uk)
  Creation Time : Sat Jun  5 10:38:11 2010
     Raid Level : raid6
   Raid Devices : 15

 Avail Dev Size : 976767730 (465.76 GiB 500.11 GB)
     Array Size : 12697970688 (6054.86 GiB 6501.36 GB)
  Used Dev Size : 976766976 (465.76 GiB 500.10 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 3e284f2e:d939fb97:0b74eb88:326e879c

Internal Bitmap : 2 sectors from superblock
    Update Time : Tue Mar  1 21:53:31 2011
       Checksum : 768f0f34 - correct
         Events : 254591

     Chunk Size : 512K

    Device Role : Active device 10
    Array State : AAA.AA.AAAAAAAA ('A' == active, '.' == missing)

The above is the drive that failed tonight, and the one I would like to re-add to the array. There have been no writes to the filesystem on the array in the last couple of days (other than what ext4 would do on its own).

mdadm --examine /dev/sdi1
/dev/sdi1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 1470c671:4236b155:67287625:899db153
           Name : file00bert.woodlea.org.uk:0  (local to host file00bert.woodlea.org.uk)
  Creation Time : Sat Jun  5 10:38:11 2010
     Raid Level : raid6
   Raid Devices : 15

 Avail Dev Size : 976767730 (465.76 GiB 500.11 GB)
     Array Size : 12697970688 (6054.86 GiB 6501.36 GB)
  Used Dev Size : 976766976 (465.76 GiB 500.10 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 8e668e39:06d8281b:b79aa3ab:a1d55fb5

Internal Bitmap : 2 sectors from superblock
    Update Time : Thu Feb 10 18:20:54 2011
       Checksum : 4078396b - correct
         Events : 254075

     Chunk Size : 512K

    Device Role : Active device 3
    Array State : AAAAAA.AAAAAAAA ('A' == active, '.' == missing)

mdadm --examine /dev/sdj1
/dev/sdj1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 1470c671:4236b155:67287625:899db153
           Name : file00bert.woodlea.org.uk:0  (local to host file00bert.woodlea.org.uk)
  Creation Time : Sat Jun  5 10:38:11 2010
     Raid Level : raid6
   Raid Devices : 15

 Avail Dev Size : 976767730 (465.76 GiB 500.11 GB)
     Array Size : 12697970688 (6054.86 GiB 6501.36 GB)
  Used Dev Size : 976766976 (465.76 GiB 500.10 GB)
    Data Offset : 272 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : 37d422cc:8436960a:c3c4d11c:81a8e4fa

Internal Bitmap : 2 sectors from superblock
    Update Time : Thu Oct 21 23:45:06 2010
       Checksum : 78950bb5 - correct
         Events : 21435

     Chunk Size : 512K

    Device Role : Active device 6
    Array State : AAAAAAAAAAAAAAA ('A' == active, '.' == missing)

Looks like sdj1 failed way back in October last year (sigh). As I said, I'm not too bothered about adding these last 2 drives back into the array, since they failed so long ago. I have a couple of spare drives sitting here, and I will replace these 2 drives with them (once I have completed a badblocks run on them; see the P.S. below for what I mean by that).

Looking at the output of dmesg, there are no other errors showing for the 3 drives, other than them being kicked out of the array for being non-fresh.

I guess I have a couple of questions.

1. What's the correct process for adding the failed /dev/sde1 back into the array so I can start it? I don't want to rush into this and make things worse.

2. What's the correct process for replacing the 2 other drives? I am presuming that I need to --fail, then --remove, then --add the drives (one at a time?), but I want to make sure.
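For question 1, here is what I was thinking of trying - please tell me if this is wrong. The idea is to stop the partially assembled array and force-assemble it from the 12 members that are still active (taken from the --detail output above) plus /dev/sde1:

mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sdh1 /dev/sdb1 /dev/sdl1 /dev/sdc1 /dev/sdm1 \
    /dev/sdd1 /dev/sdn1 /dev/sdk1 /dev/sdo1 /dev/sdf1 /dev/sdp1 /dev/sda1 /dev/sde1

and then check /proc/mdstat and do a read-only fsck (fsck -n) of the filesystem before mounting it. Is a forced assemble the right approach here, or is there a safer way?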
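For question 2, the sequence I'm presuming is the one below, run for each of the two old drives in turn, waiting for the rebuild to finish in /proc/mdstat before starting on the second one. /dev/sdi1 is used as the example, and /dev/sdX1 is just a placeholder for whatever the new drive's partition ends up being called:

mdadm /dev/md0 --fail /dev/sdi1      # probably a no-op, since --detail already shows the slot as removed
mdadm /dev/md0 --remove /dev/sdi1
mdadm /dev/md0 --add /dev/sdX1

Is that right, and is one at a time the correct way to do it, or can I add both new drives and let them rebuild together?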
Thanks for your help.

Graham.
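P.S. For completeness, by "a badblocks run" above I mean something like a destructive write-mode test across the whole new disk before I partition it, where /dev/sdX is again just a placeholder for the new disk:

badblocks -wsv /dev/sdX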