Failed RAID 5 due to OS MBR written to one of the array drives

I'm looking for advice regarding a failed (two disks down) RAID 5 array
with 4 disks. It had been running in a NAS for quite some time, but
after I came back from a trip I found that the system disk was dead.
After replacing the drive, I reinstalled the OS (OpenMediaVault, for
the curious). Sadly, the MBR was written to one of the RAID disks
instead of the OS one. This would not have been too critical if, after
booting the system, I hadn't realized that the array had already been
running in degraded mode prior to the OS disk problem. Luckily I have
a backup of most of the critical data on that array. There is nothing
I cannot replace, but losing it would still be quite inconvenient. I
guess it's a good opportunity to learn more about RAID :)

Running mdadm --stop /dev/md0 and removing the boot flag from the
wrongly written RAID disk are the only things I have done so far.
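
For the record, those two steps looked roughly like this (the parted
invocation is an assumption on my part; /dev/sda is the disk that
received the stray MBR, and any partitioning tool able to clear the
boot flag would do):

mdadm --stop /dev/md0
# clear the bootable flag on the first partition of the affected disk
parted /dev/sda set 1 boot off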

Here is the information I gathered:


root@NAStradamus:~# uname -a
Linux NAStradamus 3.2.0-4-amd64 #1 SMP Debian 3.2.68-1+deb7u5 x86_64 GNU/Linux


root@NAStradamus:~# mdadm --examine /dev/sd[a-z]1

mdadm: No md superblock detected on /dev/sda1.

/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a08bcee5:9fb42352:319ecab9:53d6277b
           Name : ArchliNAS:0
  Creation Time : Sun Jul  8 17:59:49 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 1953005569 (931.27 GiB 999.94 GB)
     Array Size : 2929507584 (2793.80 GiB 2999.82 GB)
  Used Dev Size : 1953005056 (931.27 GiB 999.94 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9b28a04c:f2c3d6c9:6f76859d:624927f0

    Update Time : Wed Sep 16 19:32:12 2015
       Checksum : 4fb98985 - correct
         Events : 108016

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 1
   Array State : AAA. ('A' == active, '.' == missing)

/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a08bcee5:9fb42352:319ecab9:53d6277b
           Name : ArchliNAS:0
  Creation Time : Sun Jul  8 17:59:49 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 1953005569 (931.27 GiB 999.94 GB)
     Array Size : 2929507584 (2793.80 GiB 2999.82 GB)
  Used Dev Size : 1953005056 (931.27 GiB 999.94 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 1952e898:50043e66:8247a64d:72ffb6c0

    Update Time : Wed Sep 16 19:32:12 2015
       Checksum : b9197a85 - correct
         Events : 108016

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 2
   Array State : AAA. ('A' == active, '.' == missing)


root@NAStradamus:~# mdadm --examine /dev/sdd
/dev/sdd:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a08bcee5:9fb42352:319ecab9:53d6277b
           Name : ArchliNAS:0
  Creation Time : Sun Jul  8 17:59:49 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 1953523120 (931.51 GiB 1000.20 GB)
     Array Size : 2929507584 (2793.80 GiB 2999.82 GB)
  Used Dev Size : 1953005056 (931.27 GiB 999.94 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 425c85ec:0c038b4b:cd59b4b5:280bf233

    Update Time : Mon May  4 08:18:04 2015
       Checksum : cca22997 - correct
         Events : 55532

         Layout : left-symmetric
     Chunk Size : 128K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)


root@NAStradamus:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdb1[1] sdc1[3]
      1953005569 blocks super 1.2

unused devices: <none>


root@NAStradamus:~# mdadm --examine --scan
ARRAY /dev/md/0 metadata=1.2 UUID=a08bcee5:9fb42352:319ecab9:53d6277b
name=ArchliNAS:0


root@NAStradamus:~# fdisk -l

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
107 heads, 58 sectors/track, 314780 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00048269

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048  1953269760   976633856+  da  Non-FS data

Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048  1953269760   976633856+  da  Non-FS data

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1            2048  1953269760   976633856+  da  Non-FS data

Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn't contain a valid partition table





First off, we can see that /dev/sdd is in a weird state. You need to
know that it was added later in order to grow the array. It seems that
I was drunk when I did that, as I didn't put any partition on that
disk…
Besides, its event count is way off. My understanding is that there is
nothing to be done with it for now, apart from re-adding it to the
array once the array is up and running again (hopefully), and once the
disk has been checked for errors and reinitialized (with a proper
partition this time!).
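
In case it matters, here is roughly what I have in mind for that step
(a sketch only; the badblocks pass is destructive and the partition
bounds are assumptions based on the other members):

# destructive read-write surface test; this wipes /dev/sdd entirely
badblocks -wsv /dev/sdd
# give it a partition this time, matching the other members' layout
parted /dev/sdd mklabel msdos
parted /dev/sdd mkpart primary 2048s 1953269760s
# let the array rebuild onto it once md0 is healthy again
mdadm --add /dev/md0 /dev/sdd1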

Now, /dev/sda is more interesting. The partition is still present and
looks intact; it seems to be just missing its superblock because of
the MBR shenanigans. Also, the two healthy drives still see it as
active.
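
One sanity check I thought of (assuming, on my part, that an intact
1.2 superblock would still show its magic 4 KiB past the partition
start, per the Super Offset of 8 sectors above) is a read-only peek at
that spot:

# the magic a92b4efc is stored little-endian, so an intact superblock
# would start with the bytes fc 4e 2b a9 at offset 4096 of /dev/sda1
dd if=/dev/sda1 bs=4096 skip=1 count=1 2>/dev/null | hexdump -C | head -n 4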

After looking around on the internet, I found people suggesting to
re-create the RAID. It seems a bit extreme to me, but I cannot find
any other solution…
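
To keep the experiment reversible, I intend to run any --create
attempt against copy-on-write overlays rather than the real disks,
along these lines (a rough sketch for one member; file sizes and
paths are my own choices):

# sparse file backing a device-mapper snapshot of /dev/sda1
truncate -s 1T /tmp/overlay-sda1
loop=$(losetup -f --show /tmp/overlay-sda1)
size=$(blockdev --getsz /dev/sda1)
dmsetup create overlay-sda1 --table "0 $size snapshot /dev/sda1 $loop P 8"
# repeat for each member, then experiment on /dev/mapper/overlay-* only
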
Luckily, I saved the original command used to create this array. Here
is the version I think would be relevant in this case:

mdadm --create --verbose --assume-clean /dev/md0 --level=5 \
    --metadata=1.2 --chunk=128 --raid-devices=4 \
    /dev/sda1 /dev/sdb1 /dev/sdc1 missing

(I left /dev/sdd out and marked its slot "missing", since its event
count is stale anyway; it would go back in via --add afterwards.)
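
One detail I'm unsure about: the surviving members report a data
offset of 262144 sectors (128 MiB), and newer mdadm releases may pick
a different default on --create, which would shift all the data. If
the mdadm at hand is recent enough to support it (3.3+, I believe),
pinning the offset explicitly seems safer:

mdadm --create --verbose --assume-clean /dev/md0 --level=5 \
    --metadata=1.2 --chunk=128 --raid-devices=4 \
    --data-offset=128M /dev/sda1 /dev/sdb1 /dev/sdc1 missing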

This would be followed by a backup, the re-addition of /dev/sdd, and
a migration to RAID 6 with two more disks.
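
For that last step, I assume the migration would look something like
this (sde1/sdf1 are placeholders for the two new disks, and the
backup-file location is arbitrary):

mdadm --add /dev/md0 /dev/sde1 /dev/sdf1
mdadm --grow /dev/md0 --level=6 --raid-devices=6 \
    --backup-file=/root/md0-grow.backup
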
The wiki advises having an experienced person review the measures
you're about to take; I don't know anybody experienced in RAID, hence
this e-mail :) What do you think?

Please CC me on any answers/comments posted to the list in response
to this; I'm not subscribed to the mailing list.

Thanks in advance for your time!

Nico