Just to double-check, would this be the right command to run?

mdadm --create --assume-clean --level=5 --size=5858385408 --raid-devices=4 /dev/md0 missing /dev/sdb3 /dev/sdc3 /dev/sdd3

Are there any other options I would need to add? Should I specify --chunk and --size (and did I enter the right size)? By the way, thanks for the help.

On Mon, Dec 8, 2014 at 4:48 AM, Robin Hill <robin@xxxxxxxxxxxxxxx> wrote:
> On Sat Dec 06, 2014 at 03:49:10PM -0500, Emery Guevremont wrote:
>> On Sat, Dec 6, 2014 at 1:56 PM, Robin Hill <robin@xxxxxxxxxxxxxxx> wrote:
>> > On Sat Dec 06, 2014 at 01:35:50PM -0500, Emery Guevremont wrote:
>> >
>> >> The long story and what I've done:
>> >>
>> >> /dev/md0 is assembled with 4 drives:
>> >> /dev/sda3
>> >> /dev/sdb3
>> >> /dev/sdc3
>> >> /dev/sdd3
>> >>
>> >> Two weeks ago, mdadm marked /dev/sda3 as failed. cat /proc/mdstat
>> >> showed _UUU, and smartctl also confirmed that the drive was dying,
>> >> so I shut down the server until I received a replacement drive.
>> >>
>> >> This week, I replaced the dying drive with my new drive, booted
>> >> into single user mode and did this:
>> >>
>> >> mdadm --manage /dev/md0 --add /dev/sda3
>> >>
>> >> A cat of /proc/mdstat confirmed the resyncing process. The last
>> >> time I checked, it was up to 11%. A few minutes later, I noticed
>> >> that the syncing had stopped. A read error message on /dev/sdd3
>> >> (I have a pic of it if anyone is interested) appeared on the
>> >> console, so it appears that /dev/sdd3 might be going bad too. A
>> >> cat /proc/mdstat showed _U_U. At that point I panicked, and
>> >> decided to leave everything as is and go to bed.
>> >>
>> >> The next day, I shut down the server and rebooted with a live USB
>> >> distro (Ubuntu Rescue Remix). After booting into the live distro,
>> >> a cat /proc/mdstat showed that my /dev/md0 was detected, but all
>> >> the drives had an (S) next to them, i.e. /dev/sda3 (S)...
>> >> Naturally I didn't like the looks of this.
>> >>
>> >> I ran ddrescue to copy /dev/sdd onto my new replacement disk
>> >> (/dev/sda). Everything worked; ddrescue hit only one read error,
>> >> and was eventually able to read the bad sector on a retry. I
>> >> followed up by also cloning sdb and sdc with ddrescue.
>> >>
>> >> So now I have cloned copies of sdb, sdc and sdd to work with.
>> >> Currently, running mdadm --assemble --scan will activate my array,
>> >> but all the drives are added as spares. Running mdadm --examine on
>> >> each drive shows the same Array UUID, but for some reason Raid
>> >> Devices is 0 and the raid level is -unknown-. The rest seems fine
>> >> and makes sense. I believe I could re-assemble my array if I could
>> >> define the raid level and raid devices.
>> >>
>> >> I wanted to know if there is a way to restore my superblocks from
>> >> the examine output I captured at the beginning. If not, what mdadm
>> >> create command should I run? Also, please let me know if drive
>> >> ordering is important, and how I can determine it from the examine
>> >> output I've got.
>> >>
>> >> Thank you.
>> >>
>> > Have you tried --assemble --force? You'll need to make sure the
>> > array's stopped first, but that's the usual way to get the array
>> > back up and running in that sort of situation.
>> >
>> > If that doesn't work, stop the array again and post:
>> >  - the output from mdadm --assemble --force --verbose /dev/md0 /dev/sd[bcd]3
>> >  - any dmesg output corresponding with the above
>> >  - --examine output for all disks
>> >  - kernel and mdadm versions
>> >
>> > Good luck,
>> >     Robin
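For reference, the stop-then-force-assemble sequence Robin describes, plus the diagnostics he asks for, as one rough shell sketch -- the device names are the ones used in this thread and may differ on another system:

  mdadm --stop /dev/md0
  mdadm --assemble --force --verbose /dev/md0 /dev/sdb3 /dev/sdc3 /dev/sdd3
  cat /proc/mdstat
  dmesg | tail -n 100
  for d in /dev/sd[bcd]3; do mdadm --examine "$d"; done
  uname -a; mdadm -V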
>
>> You'll see from the examine output that the raid level and raid
>> devices aren't defined; also notice the role of each drive. The
>> examine output (I attached 4 files) that I took right after the read
>> error during the syncing process seems to show a more accurate
>> superblock. Here's also the output of mdadm --detail /dev/md0 that I
>> took when I got the first error:
>>
>> ARRAY /dev/md/0 metadata=1.2 UUID=cf9db8fa:0c2bb553:46865912:704cceae name=runts:0
>>    spares=1
>>
>> Here's the output of how things currently are:
>>
>> mdadm --assemble --force /dev/md127 /dev/sdb3 /dev/sdc3 /dev/sdd3
>> mdadm: /dev/md127 assembled from 0 drives and 3 spares - not enough to
>> start the array.
>>
>> dmesg
>> [27903.423895] md: md127 stopped.
>> [27903.434327] md: bind<sdc3>
>> [27903.434767] md: bind<sdd3>
>> [27903.434963] md: bind<sdb3>
>>
>> cat /proc/mdstat
>> root@ubuntu:~# cat /proc/mdstat
>> Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
>> md127 : inactive sdb3[4](S) sdd3[0](S) sdc3[5](S)
>>       5858387208 blocks super 1.2
>>
>> mdadm --examine /dev/sd[bcd]3
>> /dev/sdb3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0
>>   Creation Time : Tue Jul 26 03:27:39 2011
>>      Raid Level : -unknown-
>>    Raid Devices : 0
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : active
>>     Device UUID : b2bf0462:e0722254:0e233a72:aa5df4da
>>
>>     Update Time : Sat Dec  6 12:46:40 2014
>>        Checksum : 5e8cfc9a - correct
>>          Events : 1
>>
>>    Device Role : spare
>>    Array State :  ('A' == active, '.' == missing)
>> /dev/sdc3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0
>>   Creation Time : Tue Jul 26 03:27:39 2011
>>      Raid Level : -unknown-
>>    Raid Devices : 0
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : active
>>     Device UUID : 390bd4a2:07a28c01:528ed41e:a9d0fcf0
>>
>>     Update Time : Sat Dec  6 12:46:40 2014
>>        Checksum : f69518c - correct
>>          Events : 1
>>
>>    Device Role : spare
>>    Array State :  ('A' == active, '.' == missing)
>> /dev/sdd3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0
>>   Creation Time : Tue Jul 26 03:27:39 2011
>>      Raid Level : -unknown-
>>    Raid Devices : 0
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : active
>>     Device UUID : 92589cc2:9d5ed86c:1467efc2:2e6b7f09
>>
>>     Update Time : Sat Dec  6 12:46:40 2014
>>        Checksum : 571ad2bd - correct
>>          Events : 1
>>
>>    Device Role : spare
>>    Array State :  ('A' == active, '.' == missing)
>>
>> and finally, kernel and mdadm versions:
>>
>> uname -a
>> Linux ubuntu 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:41:14 UTC 2012 i686 i686 i386 GNU/Linux
>>
>> mdadm -V
>> mdadm - v3.2.3 - 23rd December 2011
>
> The missing data looks similar to a bug fixed a couple of years ago
> (http://neil.brown.name/blog/20120615073245), though the kernel
> versions don't match and the missing data is somewhat different - it
> may be that the relevant patches were backported to the vendor kernel
> you're using.
>
> With that data missing there's no way to assemble though, so a
> re-create is required in this case (it's a last resort, but I don't
> see any other option).
>
>> /dev/sda3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0  (local to host runts)
>>   Creation Time : Mon Jul 25 23:27:39 2011
>>      Raid Level : raid5
>>    Raid Devices : 4
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>      Array Size : 5858385408 (5586.99 GiB 5998.99 GB)
>>   Used Dev Size : 3905590272 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : clean
>>     Device UUID : b2bf0462:e0722254:0e233a72:aa5df4da
>>
>>     Update Time : Tue Dec  2 23:15:37 2014
>>        Checksum : 5ed5b898 - correct
>>          Events : 3925676
>>
>>          Layout : left-symmetric
>>      Chunk Size : 512K
>>
>>    Device Role : spare
>>    Array State : A.A. ('A' == active, '.' == missing)
>
>> /dev/sdb3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0  (local to host runts)
>>   Creation Time : Mon Jul 25 23:27:39 2011
>>      Raid Level : raid5
>>    Raid Devices : 4
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>      Array Size : 5858385408 (5586.99 GiB 5998.99 GB)
>>   Used Dev Size : 3905590272 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : clean
>>     Device UUID : 92589cc2:9d5ed86c:1467efc2:2e6b7f09
>>
>>     Update Time : Tue Dec  2 23:15:37 2014
>>        Checksum : 57638ebb - correct
>>          Events : 3925676
>>
>>          Layout : left-symmetric
>>      Chunk Size : 512K
>>
>>    Device Role : Active device 0
>>    Array State : A.A. ('A' == active, '.' == missing)
>
>> /dev/sdc3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0  (local to host runts)
>>   Creation Time : Mon Jul 25 23:27:39 2011
>>      Raid Level : raid5
>>    Raid Devices : 4
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>      Array Size : 5858385408 (5586.99 GiB 5998.99 GB)
>>   Used Dev Size : 3905590272 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : clean
>>     Device UUID : 390bd4a2:07a28c01:528ed41e:a9d0fcf0
>>
>>     Update Time : Tue Dec  2 23:15:37 2014
>>        Checksum : fb20d8a - correct
>>          Events : 3925676
>>
>>          Layout : left-symmetric
>>      Chunk Size : 512K
>>
>>    Device Role : Active device 2
>>    Array State : A.A. ('A' == active, '.' == missing)
>
>> /dev/sdd3:
>>           Magic : a92b4efc
>>         Version : 1.2
>>     Feature Map : 0x0
>>      Array UUID : cf9db8fa:0c2bb553:46865912:704cceae
>>            Name : runts:0  (local to host runts)
>>   Creation Time : Mon Jul 25 23:27:39 2011
>>      Raid Level : raid5
>>    Raid Devices : 4
>>
>>  Avail Dev Size : 3905591472 (1862.33 GiB 1999.66 GB)
>>      Array Size : 5858385408 (5586.99 GiB 5998.99 GB)
>>   Used Dev Size : 3905590272 (1862.33 GiB 1999.66 GB)
>>     Data Offset : 2048 sectors
>>    Super Offset : 8 sectors
>>           State : clean
>>     Device UUID : 4156ab46:bd42c10d:8565d5af:74856641
>>
>>     Update Time : Tue Dec  2 23:14:03 2014
>>        Checksum : a126853f - correct
>>          Events : 3925672
>>
>>          Layout : left-symmetric
>>      Chunk Size : 512K
>>
>>    Device Role : Active device 1
>>    Array State : AAAA ('A' == active, '.' == missing)
>
> At least you have the previous data anyway, which should allow
> reconstruction of the array. The device names have changed between
> your two reports though, so I'd advise double-checking which is which
> before proceeding.
>
> The reports indicate that the original array order (based on the
> device role field) for the four devices was (using device UUIDs as
> they're consistent):
>   92589cc2:9d5ed86c:1467efc2:2e6b7f09
>   4156ab46:bd42c10d:8565d5af:74856641
>   390bd4a2:07a28c01:528ed41e:a9d0fcf0
>   b2bf0462:e0722254:0e233a72:aa5df4da
>
> That would give a current device order of sdd3,sda3,sdc3,sdb3 (I
> don't have the current data for sda3, but that's the only missing
> UUID).
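Before running anything destructive, it's worth re-checking that UUID-to-device mapping; a rough sketch, assuming the current device names from this thread:

  for d in /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3; do
      echo "== $d"
      mdadm --examine "$d" | grep -E 'Device UUID|Device Role'
  done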
>
> The create command would therefore be:
>   mdadm -C -l 5 -n 4 -c 512 -e 1.2 -z 1952795136 \
>       /dev/md0 /dev/sdd3 /dev/sda3 /dev/sdc3 missing
>
> mdadm 3.2.3 should use a data offset of 2048, the same as your old
> array, but you may want to double-check that with a test array on a
> couple of loopback devices first. If not, you'll need to grab the
> latest release and add the --data-offset=2048 parameter to the above
> create command.
>
> You should also follow the instructions for using overlay files at
> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID
> in order to safely test out the above without risking damage to the
> array data.
>
> Once you've run the create, run a "fsck -n" on the filesystem to check
> that the data looks okay. If not, the order or parameters may be
> incorrect - check the --examine output for any differences from the
> original results.
>
> Cheers,
>     Robin
> --
>      ___
>     ( ' }     |       Robin Hill        <robin@xxxxxxxxxxxxxxx> |
>    / / )      | Little Jim says ....                            |
>   // !!       |      "He fallen in de water !!"                 |
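For reference, a rough sketch of that non-destructive test, assuming the filesystem sits directly on /dev/md0 and that overlay devices have already been set up over the real partitions as described on the wiki page above (the /dev/mapper/* names are illustrative, not prescribed):

  # re-create the degraded array on the overlays, not the real disks
  mdadm -C -l 5 -n 4 -c 512 -e 1.2 -z 1952795136 \
      /dev/md0 /dev/mapper/sdd3 /dev/mapper/sda3 /dev/mapper/sdc3 missing
  # compare against the original --examine results (offset, layout, chunk size)
  mdadm --examine /dev/mapper/sdd3
  # read-only filesystem check; writes nothing
  fsck -n /dev/md0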