Not sure what caused the original problem. There was a failure when the
user tried to grow the array, and then I was called in for the recovery.

I can report success. Thank you, Neil: with the added step of fixing the
checksum, your instructions worked perfectly and all data was recovered.
(A sketch of the byte edits and the checksum fix is at the end of this
message, below the quoted thread.)

Veedar

On Mon, Jul 15, 2013 at 9:35 PM, NeilBrown <neilb@xxxxxxx> wrote:
> On Sat, 13 Jul 2013 16:01:20 -0400 Veedar Hokstadt <veedar@xxxxxxxxx> wrote:
>
>> Hello, please consider the following RAID5 recovery attempt after a
>> failed partial reshape.
>
> What was the sequence of events that led to the failure?
>
>> Copy-on-write devices were created to protect the original drives.
>> Any assistance on how to reassemble would be most welcome.
>
> As you say, it looks like sdf1 is confused somehow. But it is your only
> hope, so let's hope it isn't confused too much. sdc is definitely not useful.
>
> sdf1 has a 'recovery offset', which I wouldn't expect. It lines up exactly
> with the reshape position, which suggests that it is a spare which is being
> rebuilt during the reshape process.
> Did sdf1 fail and get re-added some time since the reshape started?
>
> My guess is your best bet is to use a binary editor on the metadata in sdf1 -
> it is 4K from the start of the device.
> Change the feature map (8 bytes from the start of the block) from '6' to '4',
> to say that the recovery has finished.
>
> Then look at the "dev_roles" array of 16-bit numbers, starting 256 bytes into
> the metadata. This should be the same on each device. The role '0' should
> not be present (make it 0xffff if it is there) and 1, 2, 3, 4, 5 should all
> be present.
> Then look at the 'dev_number' field in sdf1 - 160 bytes into the metadata.
> This 4-byte number should be the index in dev_roles where '3' appears.
>
> If you make those changes, then try to assemble again. Hopefully it will
> work....
>
> NeilBrown
>
>>
>> ...Operating environment is from a systemrescuecd...
>> % mdadm -V
>> mdadm - v3.1.4 - 31st August 2010
>> % /usr/local/sbin/mdadm -V          <<<<<< compiled latest by hand
>> mdadm - v3.2.6 - 25th October 2012
>> % uname -a
>> Linux dallas 3.2.33-std311-amd64 #2 SMP Wed Oct 31 07:31:30 UTC 2012
>> x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel GNU/Linux
>>
>> ...Drive /dev/mapper/cow_sdc1 appears damaged and goes offline
>> sporadically, so I'm trying to reassemble without sdc1...
>> ...In any case sdc1 is out of sync with the other drives and its
>> reshape pos'n is at zero...
>> ...Also /usb/foo is an empty file...
>>
>> % export MDADM_GROW_ALLOW_OLD=1
>> % /usr/local/sbin/mdadm -vv --assemble --force
>> --backup-file=/usb/foo /dev/md2 /dev/mapper/cow_sdd1
>> /dev/mapper/cow_sde1 /dev/mapper/cow_sdf1 /dev/mapper/cow_sdg1
>> /dev/mapper/cow_sdh1
>> mdadm: looking for devices for /dev/md2
>> mdadm: /dev/mapper/cow_sdd1 is identified as a member of /dev/md2, slot 1.
>> mdadm: /dev/mapper/cow_sde1 is identified as a member of /dev/md2, slot 2.
>> mdadm: /dev/mapper/cow_sdf1 is identified as a member of /dev/md2, slot -1.
>> mdadm: /dev/mapper/cow_sdg1 is identified as a member of /dev/md2, slot 4.
>> mdadm: /dev/mapper/cow_sdh1 is identified as a member of /dev/md2, slot 5.
>> mdadm: /dev/md2 has an active reshape - checking if critical section
>> needs to be restored
>> mdadm: Cannot read from /usb/foo
>> mdadm: accepting backup with timestamp 1372908503 for array with
>> timestamp 1373237070
>> mdadm: backup-metadata found on device-5 but is not needed
>> mdadm: No backup metadata on device-6
>> mdadm: no uptodate device for slot 0 of /dev/md2
>> mdadm: added /dev/mapper/cow_sde1 to /dev/md2 as 2
>> mdadm: no uptodate device for slot 3 of /dev/md2
>> mdadm: added /dev/mapper/cow_sdg1 to /dev/md2 as 4
>> mdadm: added /dev/mapper/cow_sdh1 to /dev/md2 as 5
>> mdadm: added /dev/mapper/cow_sdf1 to /dev/md2 as -1 (possibly out of date)
>> mdadm: added /dev/mapper/cow_sdd1 to /dev/md2 as 1
>> mdadm: /dev/md2 assembled from 4 drives - not enough to start the array.
>>
>> ...Noticed a difference in mdstat after --run, not sure if it is significant...
>> % cat /proc/mdstat
>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
>> md2 : inactive dm-1[5](S) dm-5[4](S) dm-9[7](S) dm-7[6](S) dm-3[3](S)   <<<<<<<<<<<< note five (S)'s
>> 14650675369 blocks super 1.2
>> unused devices: <none>
>> % /usr/local/sbin/mdadm -vv --run /dev/md2
>> mdadm: failed to run array /dev/md2: Input/output error
>> % cat /proc/mdstat
>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
>> md2 : inactive dm-1[5] dm-5[4](F) dm-9[7] dm-7[6] dm-3[3]   <<<<<<<<<<<< note difference
>> 11720539894 blocks super 1.2
>> unused devices: <none>
>>
>> ...Info from mdadm --examine...
>> mdadm -E /dev/mapper/cow_sdc1 /dev/mapper/cow_sdd1
>> /dev/mapper/cow_sde1 /dev/mapper/cow_sdf1 /dev/mapper/cow_sdg1
>> /dev/mapper/cow_sdh1
>>
>> /dev/mapper/cow_sdc1:
>> Magic : a92b4efc
>> Version : 1.2
>> Feature Map : 0x4
>> Array UUID : a0071bbe:16fe9e3b:76ce40a8:754d0200
>> Name : tron:0
>> Creation Time : Sat Dec 22 23:26:19 2012
>> Raid Level : raid5
>> Raid Devices : 6
>> Avail Dev Size : 5862022855 (2795.23 GiB 3001.36 GB)
>> Array Size : 29301340160 (13971.97 GiB 15002.29 GB)
>> Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> State : clean
>> Device UUID : 9eacfd8d:92eb403b:4408be7f:601e36b5
>> Reshape pos'n : 0                          <<<<<< reshape at zero
>> Delta Devices : 1 (5->6)
>> Update Time : Thu Jul 4 03:27:43 2013      <<<<<< out of sync
>> Checksum : 14fae7a3 - correct
>> Events : 125183
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : Active device 0
>> Array State : AAAAAA ('A' == active, '.' == missing)
>>
>> /dev/mapper/cow_sdd1:
>> Magic : a92b4efc
>> Version : 1.2
>> Feature Map : 0x4
>> Array UUID : a0071bbe:16fe9e3b:76ce40a8:754d0200
>> Name : tron:0
>> Creation Time : Sat Dec 22 23:26:19 2012
>> Raid Level : raid5
>> Raid Devices : 6
>> Avail Dev Size : 5860270951 (2794.40 GiB 3000.46 GB)
>> Array Size : 29301340160 (13971.97 GiB 15002.29 GB)
>> Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> State : clean
>> Device UUID : 81087206:02b470b1:6c06cb8b:63c79b21
>> Reshape pos'n : 12080240640 (11520.62 GiB 12370.17 GB)
>> Delta Devices : 1 (5->6)
>> Update Time : Sun Jul 7 22:44:30 2013
>> Checksum : 1c10ab66 - correct
>> Events : 125181
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : Active device 1
>> Array State : .AAAAA ('A' == active, '.' == missing)
>>
>> /dev/mapper/cow_sde1:
>> Magic : a92b4efc
>> Version : 1.2
>> Feature Map : 0x4
>> Array UUID : a0071bbe:16fe9e3b:76ce40a8:754d0200
>> Name : tron:0
>> Creation Time : Sat Dec 22 23:26:19 2012
>> Raid Level : raid5
>> Raid Devices : 6
>> Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
>> Array Size : 29301340160 (13971.97 GiB 15002.29 GB)
>> Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> State : clean
>> Device UUID : a7d341d2:392c9c31:0e28e8e2:865b56a9
>> Reshape pos'n : 12080240640 (11520.62 GiB 12370.17 GB)
>> Delta Devices : 1 (5->6)
>> Update Time : Sun Jul 7 22:44:30 2013
>> Checksum : 46e39caf - correct
>> Events : 125181
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : Active device 2
>> Array State : .AAAAA ('A' == active, '.' == missing)
>>
>> /dev/mapper/cow_sdf1:
>> Magic : a92b4efc
>> Version : 1.2
>> Feature Map : 0x6
>> Array UUID : a0071bbe:16fe9e3b:76ce40a8:754d0200
>> Name : tron:0
>> Creation Time : Sat Dec 22 23:26:19 2012
>> Raid Level : raid5
>> Raid Devices : 6
>> Avail Dev Size : 5860270951 (2794.40 GiB 3000.46 GB)
>> Array Size : 29301340160 (13971.97 GiB 15002.29 GB)
>> Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> Recovery Offset : 4832096256 sectors
>> State : active
>> Device UUID : 332d8290:ec203a26:df299919:9f779aa7
>> Reshape pos'n : 12080240640 (11520.62 GiB 12370.17 GB)
>> Delta Devices : 1 (5->6)
>> Update Time : Sun Jul 7 22:45:42 2013
>> Checksum : 4eaf00f5 - correct
>> Events : 125183
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : spare
>> Array State : ...... ('A' == active, '.' == missing)
>>
>> /dev/mapper/cow_sdg1:
>> Magic : a92b4efc
>> Version : 1.2
>> Feature Map : 0x4
>> Array UUID : a0071bbe:16fe9e3b:76ce40a8:754d0200
>> Name : tron:0
>> Creation Time : Sat Dec 22 23:26:19 2012
>> Raid Level : raid5
>> Raid Devices : 6
>> Avail Dev Size : 5860270951 (2794.40 GiB 3000.46 GB)
>> Array Size : 29301340160 (13971.97 GiB 15002.29 GB)
>> Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> State : clean
>> Device UUID : ca37a376:12fa661f:844f2740:cab22de8
>> Reshape pos'n : 12080240640 (11520.62 GiB 12370.17 GB)
>> Delta Devices : 1 (5->6)
>> Update Time : Sun Jul 7 22:44:30 2013
>> Checksum : 7526553f - correct
>> Events : 125181
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : Active device 4
>> Array State : .AAAAA ('A' == active, '.' == missing)
>>
>> /dev/mapper/cow_sdh1:
>> Magic : a92b4efc
>> Version : 1.2
>> Feature Map : 0x4
>> Array UUID : a0071bbe:16fe9e3b:76ce40a8:754d0200
>> Name : tron:0
>> Creation Time : Sat Dec 22 23:26:19 2012
>> Raid Level : raid5
>> Raid Devices : 6
>> Avail Dev Size : 5860268943 (2794.39 GiB 3000.46 GB)
>> Array Size : 29301340160 (13971.97 GiB 15002.29 GB)
>> Used Dev Size : 5860268032 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> State : clean
>> Device UUID : e02598c3:708630c9:e666b0cf:4189fbb0
>> Reshape pos'n : 12080240640 (11520.62 GiB 12370.17 GB)
>> Delta Devices : 1 (5->6)
>> Update Time : Sun Jul 7 22:44:30 2013
>> Checksum : c43bb5b6 - correct
>> Events : 125181
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : Active device 5
>> Array State : .AAAAA ('A' == active, '.' == missing)
>>
>> ...Thank you for your help. Veedar...
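For the archives, the whole edit boils down to something like the sketch below
(Python, untested as written; treat it as an illustration of Neil's directions
rather than a record of the exact edits). The feature map, dev_number and
dev_roles offsets (8, 160, 256) come from Neil's mail above; the sb_csum and
max_dev offsets (216, 220) and the checksum fold are taken from struct
mdp_superblock_1 in linux/raid/md_p.h and mdadm's calc_sb_1_csum(), so
double-check them against your mdadm source before writing anything, and only
ever point it at the copy-on-write devices.

#!/usr/bin/env python3
# Illustrative sketch only: patch the v1.2 superblock fields Neil describes
# above and recompute the checksum.  Offsets 8/160/256 are from Neil's mail;
# 216 (sb_csum) and 220 (max_dev) are from struct mdp_superblock_1 in
# linux/raid/md_p.h.  Point it only at a copy-on-write device, never an
# original drive.
import struct
import sys

DEV = sys.argv[1]        # e.g. /dev/mapper/cow_sdf1
SB_OFFSET = 4096         # the 1.2 superblock sits 4K from the start of the member
TARGET_ROLE = 3          # the role sdf1 is meant to take (per Neil's mail)

with open(DEV, 'rb+') as f:
    f.seek(SB_OFFSET)
    sb = bytearray(f.read(4096))   # header (256 bytes) plus dev_roles[] fits here

    # Sanity-check the magic before touching anything.
    magic, = struct.unpack_from('<I', sb, 0)
    assert magic == 0xa92b4efc, 'not an md v1.x superblock'

    # 1. Feature map, 8 bytes in: clear bit 0x2 ("recovery offset is valid"),
    #    which turns 0x6 into 0x4 and declares the recovery finished.
    fmap, = struct.unpack_from('<I', sb, 8)
    struct.pack_into('<I', sb, 8, fmap & ~0x2)

    # 2. dev_roles[] is an array of 16-bit role numbers starting 256 bytes in;
    #    its length is max_dev (32-bit word at offset 220).
    max_dev, = struct.unpack_from('<I', sb, 220)
    assert 256 + 2 * max_dev <= len(sb), 'unexpectedly large max_dev'
    roles = list(struct.unpack_from('<%dH' % max_dev, sb, 256))

    # Role 0 (the failed sdc) must not be present; 0xffff marks a slot unused.
    roles = [0xffff if r == 0 else r for r in roles]

    # 3. dev_number (32-bit word at offset 160) must be the index in dev_roles[]
    #    where TARGET_ROLE appears.  If the role is missing entirely, claim the
    #    slot this device already occupies.
    dev_number, = struct.unpack_from('<I', sb, 160)
    if TARGET_ROLE not in roles:
        roles[dev_number] = TARGET_ROLE
    struct.pack_into('<I', sb, 160, roles.index(TARGET_ROLE))
    struct.pack_into('<%dH' % max_dev, sb, 256, *roles)

    # 4. Recompute sb_csum the way mdadm's calc_sb_1_csum() does: zero the
    #    field, sum 256 + 2*max_dev bytes as little-endian 32-bit words
    #    (plus a trailing 16-bit word if two bytes are left over), then fold
    #    the carry back in once.
    struct.pack_into('<I', sb, 216, 0)
    size = 256 + 2 * max_dev
    total = sum(struct.unpack_from('<%dI' % (size // 4), sb, 0))
    if size % 4 == 2:
        total += struct.unpack_from('<H', sb, size - 2)[0]
    csum = ((total & 0xffffffff) + (total >> 32)) & 0xffffffff
    struct.pack_into('<I', sb, 216, csum)

    # Write the patched superblock back to the same offset.
    f.seek(SB_OFFSET)
    f.write(sb)

If it helps anyone else: run it against the cow device only (for example
python3 patch_sb.py /dev/mapper/cow_sdf1, with patch_sb.py being whatever you
save the sketch as), then confirm with mdadm -E that the checksum still reads
"correct" and the recovery offset is gone before retrying the assemble.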