Re: Problem recovering failed Intel Rapid Storage raid5 volume

NeilBrown <neilb@xxxxxxx> · Mon, 23 Jul 2012 09:08:56 +1000

On Sat, 21 Jul 2012 21:00:19 +0500 Khurram Hassan <kfhassan@xxxxxxxxx> wrote:

> I have this 3 disk raid5 volume on an Asus motherboard sporting an
> Intel Rapid Storage chipset. The problem began when I noticed in
> windows that one of the hard disks (the first one in the array) was
> marked as failed in the Intel raid utility. I shutdown the system to
> remove the hard disk and removed the cables for the faulty hard disk.
> But I made a mistake and removed the cables for one of the working hard
> disks. So when I booted, it showed the raid volume as failed. I
> quickly shutdown the system and corrected the mistake. But it
> completely hosed my raid volume. When I booted the system up again,
> both of the remaining 2 hard disks were shown as offline.
> 
> I read the raid recovery section in the wiki and installed ubuntu
> 12.04 on a separate non-raid hard disk (after completely disconnecting
> the offline raid5 volume). Then I reconnected the 2 hard disks and
> booted ubuntu. Then I gave the following commands:
> 
> 1) mdadm --examine /dev/sd[bc] > raid.status
> 2) mdadm --create --assume-clean -c 128 --level=5 --raid-devices=3
> /dev/md1 missing /dev/sdb /dev/sdc
> 
> It gave the following output:
>     mdadm: /dev/sdb appears to be part of a raid array:
>         level=container devices=0 ctime=Thu Jan  1 05:00:00 1970
>     mdadm: /dev/sdc appears to be part of a raid array:
>         level=container devices=0 ctime=Thu Jan  1 05:00:00 1970
>     Continue creating array? y
>     mdadm: Defaulting to version 1.2 metadata
>     mdadm: array /dev/md1 started.
> 
> But the raid volume is not accessible. mdadm --examine /dev/md1 gives:
> 
>     mdadm: No md superblock detected on /dev/md1.
> 
> Worse, upon booting the system, the raid chipset message says the 2
> hard disks are non-raid hard disks. Have I completely messed up the
> raid volume? Is it not recoverable at all?

Possibly :-(

You had an array with Intel-specific metadata.  This metadata is stored at
the end of the device.

When you tried to "--create" the array, you did not ask for Intel metadata so
you got the default v1.2 metadata.  This metadata is stored at the beginning
of the device (a 1K block, 4K from the start), so this would have
over-written a small amount of filesystem data.
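
A minimal, read-only sketch for confirming that a v1.2 superblock now sits
4K into a device.  It assumes only that v1.x md superblocks begin with the
magic number 0xa92b4efc stored little-endian.  The helper name has_v12_magic
is hypothetical, not part of mdadm.

```shell
# Hypothetical helper: succeed if FILE (a disk or an image of one) carries
# the md v1.x superblock magic at byte offset 4096.  It only reads.
has_v12_magic() {
    # The magic 0xa92b4efc is little-endian on disk: bytes fc 4e 2b a9.
    magic=$(dd if="$1" bs=1 skip=4096 count=4 2>/dev/null \
            | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "fc4e2ba9" ]
}
```

Run as root against a real disk, for example
"has_v12_magic /dev/sdb && echo found".  mdadm --examine reports the same
thing in more detail; this only shows where the bytes live.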

Also when you --create an array, mdadm erases any other metadata that it
finds to avoid confusion.  So it will have erased the Intel metadata from the
end.

Your best hope is to recreate the array correctly with Intel metadata.  The
filesystem will quite possibly be corrupted, but you might get some or even
all of your data back.

Can you post the "raid.status"?  That would help us be certain we are doing
the right thing.
Something like
  mdadm --create /dev/md/imsm -e imsm -n 3 missing /dev/sdb /dev/sdc
  mdadm --create /dev/md1 -c 128 -l 5 -n 3 /dev/md/imsm

might do it  ... or might not.  I'm not sure about creating imsm arrays with
missing devices.  Maybe you still list the 3 devices rather than just the
container.  I'd need to experiment.  If you post the raid.status I'll see if
I can work out the best way forward.

NeilBrown
