Hi all,

I have some trouble recreating a likely undamaged RAID5 array with 4 disks. I initially created the array one year ago:

mdadm --create /dev/md0 --level=5 --chunk=128 --raid-devices=4 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

/dev/md0 is the base for a LUKS container, which is used as a physical volume for LVM. Recently I moved the array to a new machine, and everything kept working fine. Then I experimented with hdparm to put the drives into sleep mode. After a few sleep/wake cycles everything was still working fine. On Sunday an event occurred where 2 of the 4 disks didn't wake up properly. I woke the array up by issuing this command:

dd if=/dev/zero bs=512 count=1 > mnt/tmp/test.img

(mnt/tmp resides on the array/LUKS/LVM stack.) I cannot remember whether an error message appeared, but I saw that 2 drives were no longer accessible. I decided to restart the machine to get the drives working again. After the restart all 4 drives were working, but the array didn't come up because of 2 missing drives. After some googling I found

https://raid.wiki.kernel.org/index.php/RAID_Recovery

This guide seems to deal with my problem. I tried recreating my array with these steps:

1. mdadm --examine /dev/sd[acde]1 > raid.status

raid.status contains this:

/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 65089aca:483766d3:0db55271:27c73384
           Name : oldman:0
  Creation Time : Sun Jun 26 14:35:51 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 3907024896 (1863.01 GiB 2000.40 GB)
     Array Size : 5860536576 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 3907024384 (1863.01 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 3d7a888c:7488a859:f53a0fe5:5bbc1bda
    Update Time : Sat Jul 14 22:08:08 2012
       Checksum : eeb6d1f1 - correct
         Events : 266
         Layout : left-symmetric
     Chunk Size : 128K
    Device Role : Active device 3
    Array State : AAAA ('A' == active, '.' == missing)

/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 65089aca:483766d3:0db55271:27c73384
           Name : oldman:0
  Creation Time : Sun Jun 26 14:35:51 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 3907024896 (1863.01 GiB 2000.40 GB)
     Array Size : 5860536576 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 3907024384 (1863.01 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 578cb781:387a3537:e42e9414:3d6e25e7
    Update Time : Sat Jul 14 22:08:08 2012
       Checksum : fe017b9c - correct
         Events : 266
         Layout : left-symmetric
     Chunk Size : 128K
    Device Role : Active device 1
    Array State : AAAA ('A' == active, '.' == missing)

/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 65089aca:483766d3:0db55271:27c73384
           Name : oldman:0
  Creation Time : Sun Jun 26 14:35:51 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 3907024896 (1863.01 GiB 2000.40 GB)
     Array Size : 5860536576 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 3907024384 (1863.01 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 3dc3e9e2:ab9a2199:5618c6cd:ca0631fa
    Update Time : Sun Jul 15 11:25:20 2012
       Checksum : 8d5c0fd0 - correct
         Events : 271
         Layout : left-symmetric
     Chunk Size : 128K
    Device Role : Active device 2
    Array State : A.A. ('A' == active, '.' == missing)

/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 65089aca:483766d3:0db55271:27c73384
           Name : oldman:0
  Creation Time : Sun Jun 26 14:35:51 2011
     Raid Level : raid5
   Raid Devices : 4
 Avail Dev Size : 3907024896 (1863.01 GiB 2000.40 GB)
     Array Size : 5860536576 (5589.04 GiB 6001.19 GB)
  Used Dev Size : 3907024384 (1863.01 GiB 2000.40 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 22ecbee2:a26678a3:c2aeca18:007edd48
    Update Time : Sun Jul 15 11:25:20 2012
       Checksum : 3139124b - correct
         Events : 271
         Layout : left-symmetric
     Chunk Size : 128K
    Device Role : Active device 0
    Array State : A.A. ('A' == active, '.' == missing)

2. grep Role raid.status gives:

    Device Role : Active device 3
    Device Role : Active device 1
    Device Role : Active device 2
    Device Role : Active device 0
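Sorted by role number, this gives the original device order /dev/sde1, /dev/sdc1, /dev/sdd1, /dev/sda1 (roles 0 to 3), which is the order I used in the next step. Just as a rough helper sketch (not from the guide), the mapping can be printed directly like this:

for d in /dev/sd[acde]1; do
    echo "$(mdadm --examine "$d" | awk '/Device Role/ {print $NF}') $d"
done | sort -n

which prints one "role device" pair per line, sorted by role.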
3. mdadm --create --assume-clean --level=5 --chunk=128 --raid-devices=4 /dev/md0 /dev/sde1 /dev/sdc1 /dev/sdd1 /dev/sda1

This gave the message that all drives appear to be part of a RAID array. I confirmed with "Continue creating array" and /dev/md0 was created. But the data in /dev/md0 is corrupted: I cannot luksOpen the device, and the first bytes of /dev/md0 do not contain the LUKS header. Something went wrong in recreating the array.

I've built a simple script that tries to create the array and luksOpen it with all 24 possible permutations of the disk order, but none of them brings my data back.
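In case it is useful, this is roughly the idea of that script (a minimal sketch, not my exact code; it only checks the first 4 bytes of /dev/md0 for the LUKS magic instead of doing a full luksOpen):

#!/bin/sh
# Brute-force sketch: recreate the array with every possible order of the four
# partitions and look for the LUKS magic ("LUKS") at the start of /dev/md0.
DEVS="/dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1"

for a in $DEVS; do
 for b in $DEVS; do
  for c in $DEVS; do
   for d in $DEVS; do
    # use each device exactly once per attempt (24 valid orders)
    if [ "$a" = "$b" ] || [ "$a" = "$c" ] || [ "$a" = "$d" ] || \
       [ "$b" = "$c" ] || [ "$b" = "$d" ] || [ "$c" = "$d" ]; then
     continue
    fi

    mdadm --stop /dev/md0 2>/dev/null
    # --run suppresses the "Continue creating array?" question
    mdadm --create /dev/md0 --assume-clean --run --level=5 --chunk=128 \
          --raid-devices=4 "$a" "$b" "$c" "$d"

    magic=$(dd if=/dev/md0 bs=4 count=1 2>/dev/null)
    echo "$a $b $c $d -> $magic"
    if [ "$magic" = "LUKS" ]; then
     echo "LUKS header found with order: $a $b $c $d"
     exit 0
    fi
   done
  done
 done
done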
I'm pretty sure that I didn't write anything to the array, except the 512 bytes to wake it up from sleep (see above).

The kernel messages from around the time the drives didn't respond are:

Jul 15 11:21:12 EBENE kernel: [30046.863193] sd 2:0:0:0: [sdc] Unhandled error code
Jul 15 11:21:12 EBENE kernel: [30046.863197] sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:21:12 EBENE kernel: [30046.863201] sd 2:0:0:0: [sdc] CDB: Read(10): 28 00 2f 91 75 00 00 00 a0 00
Jul 15 11:21:12 EBENE kernel: [30046.863374] sd 0:0:0:0: [sda] Unhandled error code
Jul 15 11:21:12 EBENE kernel: [30046.863376] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:21:12 EBENE kernel: [30046.863378] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 2f 91 75 00 00 00 a0 00
Jul 15 11:21:12 EBENE kernel: [30046.864442] sd 0:0:0:0: [sda] Unhandled error code
Jul 15 11:21:12 EBENE kernel: [30046.864445] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:21:12 EBENE kernel: [30046.864449] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 2f 91 76 00 00 00 a0 00
Jul 15 11:21:12 EBENE kernel: [30046.864508] sd 2:0:0:0: [sdc] Unhandled error code
Jul 15 11:21:12 EBENE kernel: [30046.864510] sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:21:12 EBENE kernel: [30046.864513] sd 2:0:0:0: [sdc] CDB: Read(10): 28 00 2f 91 75 a0 00 00 60 00
Jul 15 11:21:12 EBENE kernel: [30046.864638] sd 0:0:0:0: [sda] Unhandled error code
Jul 15 11:21:12 EBENE kernel: [30046.864640] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:21:12 EBENE kernel: [30046.864642] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 2f 91 75 a0 00 00 60 00
Jul 15 11:21:12 EBENE kernel: [30046.864680] sd 2:0:0:0: [sdc] Unhandled error code
Jul 15 11:21:12 EBENE kernel: [30046.864681] sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:21:12 EBENE kernel: [30046.864683] sd 2:0:0:0: [sdc] CDB: Read(10): 28 00 2f 91 76 00 00 00 a0 00
Jul 15 11:23:38 EBENE kernel: [30192.746530] sd 0:0:0:0: [sda] Unhandled error code
Jul 15 11:23:38 EBENE kernel: [30192.746534] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:23:38 EBENE kernel: [30192.746539] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 20 00
Jul 15 11:23:38 EBENE kernel: [30192.746810] sd 0:0:0:0: [sda] Unhandled error code
Jul 15 11:23:38 EBENE kernel: [30192.746812] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jul 15 11:23:38 EBENE kernel: [30192.746815] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00

I have modified the Permute_array.pl from the wiki page to not mount /dev/md0 but instead print out the first 4 bytes of /dev/md0. These must be "LUKS" if the array is recreated successfully, but I didn't get this result for any of the permutations.

I'm afraid I have killed my array by not using "missing" for one of the drives in my first attempt (see point 3 above). Will my array remain corrupted? Any suggestions?

Thanks in advance,
André