On Thu, Dec 17, 2009 at 9:40 AM, Majed B. <majedb@xxxxxxxxx> wrote:
> I'm assuming you ran the command with the 2 external disks added to the array.
> One question before proceeding: When you removed these 2 externals,
> were there any changes on the array? Did you add/delete/modify any
> files or rename them?

I shut the box down, unplugged the drives, and booted the box back up.

> What do you mean the 2 externals have had mkfs run on them? Is this
> AFTER you removed the disks from the array? If so, they're useless
> now.

That's what I figured.

> The names of the disks have changed and their names in the superblock
> are different than what udev is reporting them:
> sde now was named sdg
> sdf is sdf
> sdb is sdb
> sdc is sdc
> sdd is sdd
>
> According to the listing above, you have superblock info on: sdb, sdc,
> sdd, sde, sdf; 5 disks out of 7 -- one of which is a spare.
> sdb was a spare and according to other disks' info, it didn't resync,
> so it has no useful data to aid in recovery.
> So you're left with 4 out of 6 disks + 1 spare.
>
> You have a chance of running the array in degraded mode using sde,
> sdc, sdd, sdf, assuming these disks are sane.
>
> Try running this command:
> mdadm -Af /dev/md0 /dev/sde /dev/sdc /dev/sdd /dev/sdf

mdadm: forcing event count in /dev/sdf(1) from 97276 upto 580158
mdadm: /dev/md0 has been started with 4 drives (out of 6).

> then check: cat /proc/mdstat

root@dhcp128:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdf[1] sde[5] sdd[3] sdc[2]
      5860549632 blocks level 6, 64k chunk, algorithm 2 [6/4] [_UUU_U]

unused devices: <none>

> If the remaining disks are sane, it should run the array in degraded
> mode. Hopefully.

dmesg:

[31828.093953] md: md0 stopped.
[31838.929607] md: bind<sdc>
[31838.931455] md: bind<sdd>
[31838.932073] md: bind<sde>
[31838.932376] md: bind<sdf>
[31838.973346] raid5: device sdf operational as raid disk 1
[31838.973349] raid5: device sde operational as raid disk 5
[31838.973351] raid5: device sdd operational as raid disk 3
[31838.973353] raid5: device sdc operational as raid disk 2
[31838.973787] raid5: allocated 6307kB for md0
[31838.974165] raid5: raid level 6 set md0 active with 4 out of 6 devices, algorithm 2
[31839.066014] RAID5 conf printout:
[31839.066016]  --- rd:6 wd:4
[31839.066018]  disk 1, o:1, dev:sdf
[31839.066020]  disk 2, o:1, dev:sdc
[31839.066022]  disk 3, o:1, dev:sdd
[31839.066024]  disk 5, o:1, dev:sde
[31839.066066] md0: detected capacity change from 0 to 6001202823168
[31839.066188]  md0: p1

root@dhcp128:/media# fdisk -l /dev/md0

Disk /dev/md0: 6001.2 GB, 6001202823168 bytes
255 heads, 63 sectors/track, 729604 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x96af0591

    Device Boot      Start         End      Blocks   Id  System
/dev/md0p1               1      182401  1465136001   83  Linux

And now the bad news:

mount /dev/md0p1 md0p1
mount: wrong fs type, bad option, bad superblock on /dev/md0p1

[32359.038796] raid5: Disk failure on sde, disabling device.
[32359.038797] raid5: Operation continuing on 3 devices.

> If that doesn't work, I'd say you're better off scrapping & restoring
> your data back onto a new array rather than waste more time fiddling
> with superblocks.

Yep, starting that now. This is exactly what I was expecting: very few
things to try (like one) and a very clear pass/fail test.

Thanks for helping me get through this.
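Note that the forced assembly had to pull sdf's event count from 97276 up to
580158, and sdf's superblock update time (Apr 8) is months older than the
other members' (Jul 12), so the assembled array contents may simply be
inconsistent. The following is only a sketch of read-only checks one might
run in a situation like this before deciding to scrap the array; the device
names match the ones above, ext3 is an assumption about the original
filesystem, and none of these commands write to disk:

  # compare event counts and update times across members before forcing assembly
  mdadm -E /dev/sd[cdef] | grep -E '^/dev/|Events|Update Time'

  # look for a filesystem signature on the whole array, and on the
  # partition that the stray MBR advertises
  blkid /dev/md0 /dev/md0p1
  file -s /dev/md0 /dev/md0p1

  # read-only filesystem check (-n answers "no" to everything, so nothing
  # is modified); ext3 is an assumption here
  fsck.ext3 -n /dev/md0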
>
> On Thu, Dec 17, 2009 at 6:06 PM, Carl Karsten <carl@xxxxxxxxxxxxxxxxx> wrote:
>> I brought back the 2 externals, which have had mkfs run on them, but
>> maybe the extra superblocks will help (doubt it, but couldn't hurt)
>>
>> root@dhcp128:/media# mdadm -E /dev/sd[a-z]
>> mdadm: No md superblock detected on /dev/sda.
>> /dev/sdb:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>   Creation Time : Wed Mar 25 21:04:08 2009
>>      Raid Level : raid6
>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>    Raid Devices : 6
>>   Total Devices : 6
>> Preferred Minor : 0
>>
>>     Update Time : Tue Mar 31 23:08:02 2009
>>           State : clean
>>  Active Devices : 5
>> Working Devices : 6
>>  Failed Devices : 1
>>   Spare Devices : 1
>>        Checksum : a4fbb93a - correct
>>          Events : 8430
>>
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     6       8       16        6      spare   /dev/sdb
>>
>>    0     0       8        0        0      active sync   /dev/sda
>>    1     1       8       64        1      active sync   /dev/sde
>>    2     2       8       32        2      active sync   /dev/sdc
>>    3     3       8       48        3      active sync   /dev/sdd
>>    4     4       0        0        4      faulty removed
>>    5     5       8       80        5      active sync   /dev/sdf
>>    6     6       8       16        6      spare   /dev/sdb
>> /dev/sdc:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>   Creation Time : Wed Mar 25 21:04:08 2009
>>      Raid Level : raid6
>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>    Raid Devices : 6
>>   Total Devices : 4
>> Preferred Minor : 0
>>
>>     Update Time : Sun Jul 12 11:31:47 2009
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 2
>>   Spare Devices : 0
>>        Checksum : a59452db - correct
>>          Events : 580158
>>
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     2       8       32        2      active sync   /dev/sdc
>>
>>    0     0       8        0        0      active sync   /dev/sda
>>    1     1       0        0        1      faulty removed
>>    2     2       8       32        2      active sync   /dev/sdc
>>    3     3       8       48        3      active sync   /dev/sdd
>>    4     4       0        0        4      faulty removed
>>    5     5       8       96        5      active sync   /dev/sdg
>> /dev/sdd:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>   Creation Time : Wed Mar 25 21:04:08 2009
>>      Raid Level : raid6
>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>    Raid Devices : 6
>>   Total Devices : 4
>> Preferred Minor : 0
>>
>>     Update Time : Sun Jul 12 11:31:47 2009
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 2
>>   Spare Devices : 0
>>        Checksum : a59452ed - correct
>>          Events : 580158
>>
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     3       8       48        3      active sync   /dev/sdd
>>
>>    0     0       8        0        0      active sync   /dev/sda
>>    1     1       0        0        1      faulty removed
>>    2     2       8       32        2      active sync   /dev/sdc
>>    3     3       8       48        3      active sync   /dev/sdd
>>    4     4       0        0        4      faulty removed
>>    5     5       8       96        5      active sync   /dev/sdg
>> /dev/sde:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>   Creation Time : Wed Mar 25 21:04:08 2009
>>      Raid Level : raid6
>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>    Raid Devices : 6
>>   Total Devices : 4
>> Preferred Minor : 0
>>
>>     Update Time : Sun Jul 12 11:31:47 2009
>>           State : clean
>>  Active Devices : 4
>> Working Devices : 4
>>  Failed Devices : 2
>>   Spare Devices : 0
>>        Checksum : a5945321 - correct
>>          Events : 580158
>>
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     5       8       96        5      active sync   /dev/sdg
>>
>>    0     0       8        0        0      active sync   /dev/sda
>>    1     1       0        0        1      faulty removed
>>    2     2       8       32        2      active sync   /dev/sdc
>>    3     3       8       48        3      active sync   /dev/sdd
>>    4     4       0        0        4      faulty removed
>>    5     5       8       96        5      active sync   /dev/sdg
>> /dev/sdf:
>>           Magic : a92b4efc
>>         Version : 00.90.00
>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>   Creation Time : Wed Mar 25 21:04:08 2009
>>      Raid Level : raid6
>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>    Raid Devices : 6
>>   Total Devices : 5
>> Preferred Minor : 0
>>
>>     Update Time : Wed Apr  8 11:13:32 2009
>>           State : clean
>>  Active Devices : 5
>> Working Devices : 5
>>  Failed Devices : 1
>>   Spare Devices : 0
>>        Checksum : a5085415 - correct
>>          Events : 97276
>>
>>      Chunk Size : 64K
>>
>>       Number   Major   Minor   RaidDevice State
>> this     1       8       80        1      active sync   /dev/sdf
>>
>>    0     0       8        0        0      active sync   /dev/sda
>>    1     1       8       80        1      active sync   /dev/sdf
>>    2     2       8       32        2      active sync   /dev/sdc
>>    3     3       8       48        3      active sync   /dev/sdd
>>    4     4       0        0        4      faulty removed
>>    5     5       8       96        5      active sync   /dev/sdg
>> mdadm: No md superblock detected on /dev/sdg.
>>
>>
>> On Thu, Dec 17, 2009 at 8:39 AM, Majed B. <majedb@xxxxxxxxx> wrote:
>>> You can't copy and change bytes to identify disks.
>>>
>>> To check which disks belong to an array, do this:
>>> mdadm -E /dev/sd[a-z]
>>>
>>> The disks that you get info from belong to the existing array(s).
>>>
>>> In the first email you sent you included an examine output for one of
>>> the disks that listed another disk as a spare (sdb). The output of
>>> examine should shed more light.
>>>
>>> On Thu, Dec 17, 2009 at 5:15 PM, Carl Karsten <carl@xxxxxxxxxxxxxxxxx> wrote:
>>>> On Thu, Dec 17, 2009 at 4:35 AM, Majed B. <majedb@xxxxxxxxx> wrote:
>>>>> I have misread the information you've provided, so allow me to correct myself:
>>>>>
>>>>> You're running a RAID6 array, with 2 disks lost/failed. Any disk loss
>>>>> after that will cause data loss since you have no redundancy (2 disks
>>>>> died).
>>>>
>>>> Right - but I am not sure if data loss has actually occurred, where "data" is
>>>> the data being stored on the raid, not the raid metadata.
>>>>
>>>> My guess is I need to copy the raid superblock from one of the other
>>>> disks (say sdb), find the bytes that identify the disk, and change them
>>>> from sdb to sda.
>>>>
>>>>>
>>>>> I believe it's still possible to reassemble the array, but you only
>>>>> need to remove the MBR. See this page for information:
>>>>> http://www.cyberciti.biz/faq/linux-how-to-uninstall-grub/
>>>>> dd if=/dev/zero of=/dev/sdX bs=446 count=1
>>>>>
>>>>> Before proceeding, provide the output of cat /proc/mdstat
>>>>
>>>> root@dhcp128:~# cat /proc/mdstat
>>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>>>> [raid4] [raid10]
>>>> unused devices: <none>
>>>>
>>>>> Is the array currently running degraded or is it suspended?
>>>>
>>>> Um, not running; not sure I would call it suspended.
>>>>
>>>>> What happened to the spare disk assigned?
>>>>
>>>> I don't understand.
>>>>
>>>>> Did it finish resyncing
>>>>> before you installed grub on the wrong disk?
>>>>
>>>> I think so.
>>>>
>>>> I am fairly sure I could assemble the array before I installed grub.
>>>>
>>>>>
>>>>> On Thu, Dec 17, 2009 at 8:21 AM, Majed B. <majedb@xxxxxxxxx> wrote:
>>>>>> If your other disks are sane and you are able to run a degraded array, then
>>>>>> you can remove grub using dd, then re-add the disk to the array.
>>>>>>
>>>>>> To clear the first 1MB of the disk:
>>>>>> dd if=/dev/zero of=/dev/sdx bs=1M count=1
>>>>>> Replace sdx with the disk name that has grub.
>>>>>>
>>>>>> On Dec 17, 2009 6:53 AM, "Carl Karsten" <carl@xxxxxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> I took over a box that had 1 IDE boot drive and 6 SATA raid drives (4
>>>>>> internal, 2 external). I believe the 2 externals were redundant, so
>>>>>> they could be removed. So I did, and mkfs-ed them. Then I installed
>>>>>> Ubuntu to the IDE drive, and installed grub to sda, which turns out
>>>>>> to be the first SATA disk. That would be fine if the raid was on
>>>>>> sda1, but it is on sda, and now the raid won't assemble. No surprise,
>>>>>> and I do have a backup of the data spread across 5 external drives.
>>>>>> But before I abandon the array, I am wondering if I can fix it by
>>>>>> recreating mdadm's metadata on sda, given I have sd[bcd] to work with.
>>>>>>
>>>>>> Any suggestions?
>>>>>>
>>>>>> root@dhcp128:~# mdadm --examine /dev/sd[abcd]
>>>>>> mdadm: No md superblock detected on /dev/sda.
>>>>>> /dev/sdb:
>>>>>>           Magic : a92b4efc
>>>>>>         Version : 00.90.00
>>>>>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>>   Creation Time : Wed Mar 25 21:04:08 2009
>>>>>>      Raid Level : raid6
>>>>>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>>    Raid Devices : 6
>>>>>>   Total Devices : 6
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>>     Update Time : Tue Mar 31 23:08:02 2009
>>>>>>           State : clean
>>>>>>  Active Devices : 5
>>>>>> Working Devices : 6
>>>>>>  Failed Devices : 1
>>>>>>   Spare Devices : 1
>>>>>>        Checksum : a4fbb93a - correct
>>>>>>          Events : 8430
>>>>>>
>>>>>>      Chunk Size : 64K
>>>>>>
>>>>>>       Number   Major   Minor   RaidDevice State
>>>>>> this     6       8       16        6      spare   /dev/sdb
>>>>>>
>>>>>>    0     0       8        0        0      active sync   /dev/sda
>>>>>>    1     1       8       64        1      active sync   /dev/sde
>>>>>>    2     2       8       32        2      active sync   /dev/sdc
>>>>>>    3     3       8       48        3      active sync   /dev/sdd
>>>>>>    4     4       0        0        4      faulty removed
>>>>>>    5     5       8       80        5      active sync
>>>>>>    6     6       8       16        6      spare   /dev/sdb
>>>>>> /dev/sdc:
>>>>>>           Magic : a92b4efc
>>>>>>         Version : 00.90.00
>>>>>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>>   Creation Time : Wed Mar 25 21:04:08 2009
>>>>>>      Raid Level : raid6
>>>>>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>>    Raid Devices : 6
>>>>>>   Total Devices : 4
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>>     Update Time : Sun Jul 12 11:31:47 2009
>>>>>>           State : clean
>>>>>>  Active Devices : 4
>>>>>> Working Devices : 4
>>>>>>  Failed Devices : 2
>>>>>>   Spare Devices : 0
>>>>>>        Checksum : a59452db - correct
>>>>>>          Events : 580158
>>>>>>
>>>>>>      Chunk Size : 64K
>>>>>>
>>>>>>       Number   Major   Minor   RaidDevice State
>>>>>> this     2       8       32        2      active sync   /dev/sdc
>>>>>>
>>>>>>    0     0       8        0        0      active sync   /dev/sda
>>>>>>    1     1       0        0        1      faulty removed
>>>>>>    2     2       8       32        2      active sync   /dev/sdc
>>>>>>    3     3       8       48        3      active sync   /dev/sdd
>>>>>>    4     4       0        0        4      faulty removed
>>>>>>    5     5       8       96        5      active sync
>>>>>> /dev/sdd:
>>>>>>           Magic : a92b4efc
>>>>>>         Version : 00.90.00
>>>>>>            UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>>   Creation Time : Wed Mar 25 21:04:08 2009
>>>>>>      Raid Level : raid6
>>>>>>   Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>>      Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>>    Raid Devices : 6
>>>>>>   Total Devices : 4
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>>     Update Time : Sun Jul 12 11:31:47 2009
>>>>>>           State : clean
>>>>>>  Active Devices : 4
>>>>>> Working Devices : 4
>>>>>>  Failed Devices : 2
>>>>>>   Spare Devices : 0
>>>>>>        Checksum : a59452ed - correct
>>>>>>          Events : 580158
>>>>>>
>>>>>>      Chunk Size : 64K
>>>>>>
>>>>>>       Number   Major   Minor   RaidDevice State
>>>>>> this     3       8       48        3      active sync   /dev/sdd
>>>>>>
>>>>>>    0     0       8        0        0      active sync   /dev/sda
>>>>>>    1     1       0        0        1      faulty removed
>>>>>>    2     2       8       32        2      active sync   /dev/sdc
>>>>>>    3     3       8       48        3      active sync   /dev/sdd
>>>>>>    4     4       0        0        4      faulty removed
>>>>>>    5     5       8       96        5      active sync
>>>>>>
>>>>>> --
>>>>>> Carl K
>>>>>
>>>>> --
>>>>> Majed B.
>>>>
>>>> --
>>>> Carl K
>>>
>>> --
>>> Majed B.
>>
>> --
>> Carl K
>
> --
> Majed B.

--
Carl K
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
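An addendum to the dd suggestions quoted above (the bs=446 and bs=1M
variants): before overwriting the start of a disk that may still be an md
member, it is cheap to save that region first so the step is reversible.
A sketch, with sdX as a placeholder for the disk that got GRUB and an
arbitrary backup file name:

  # back up the first 1 MiB (covers the MBR boot code, the partition
  # table, and the gap where GRUB embeds its next stage)
  dd if=/dev/sdX of=/root/sdX-first-1M.img bs=1M count=1

  # wipe only the 446 bytes of boot code, keeping the partition table
  dd if=/dev/zero of=/dev/sdX bs=446 count=1

  # to restore the saved copy later if something goes wrong:
  # dd if=/root/sdX-first-1M.img of=/dev/sdX bs=1M count=1

With 0.90 superblocks, as used on this array, the md metadata lives near
the end of the device, so clearing the first sectors does not touch it.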