Re: Incorrect array member slot assignment during assemble

jay@xxxxxxxxxxxxxxxxxxxxxxxxx · Tue, 22 Apr 2008 11:47:22 +0000

Quoting Carlos Carvalho <carlos@xxxxxxxxxxxxxx>:

jay@xxxxxxxxxxxxxxxxxxxxxxxxx (jay@xxxxxxxxxxxxxxxxxxxxxxxxx) wrote
on 21 April 2008 16:20:
 >Quoting Carlos Carvalho <carlos@xxxxxxxxxxxxxx>:
 >
 >> jay@xxxxxxxxxxxxxxxxxxxxxxxxx (jay@xxxxxxxxxxxxxxxxxxxxxxxxx) wrote
 >> on 20 April 2008 18:35:
 >>  >I had a single disk failure in a 3-disk RAID5 array recently, and have
 >>  >been trying to reassemble the array with the remaining devices, but am
 >>  >running into some issues.
 >>  >
 >>  >The failed disk died during synchronization into the array,
 >>
 >> Did another one fail during the resync? If not you can try to
 >> reassemble the array with only the 2 good disks using --force.
 >
 >The array did fall offline shortly after the first failure, which
 >seemed unexpected; the two remaining disks (and controller) still seem
 >healthy (states 'clean' and 'active' respectively), just that they
 >refuse to assemble.

You should check the logs to see the cause.

Unfortunately, all that was recorded was a slightly nebulous 'read
error' before the array dropped. I suspect the controller may have
flaked out after the first drive failure.

 >> If a second disk failed you're in trouble. Since the array stopped at
 >> the moment of the second failure, the two disks are still in sync.
 >> Well, almost... You can then make an image of the second failed disk
 >> on a good one and use --force to reassemble again. Then fsck...
 >
 >This sounds good - the only filesystem mounted over the devices was
 >read-only at the time, so I'm hoping that the two good disks should
 >still be enough for some data recovery.
 >
 >The only problem I see is that if I make a raw image copy of the
 >second disk, it will still have the incorrect 'slot' assignment in the
 >superblock.  I suppose I could dd everything except the superblock -
 >but is there a mechanism to repair/recreate the superblock on a raw
 >disk?

No, just assemble with --force.

This is what I'd been trying before posting. From my original message:

mdadm -Afv /dev/md0 /dev/sdc /dev/sdd
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 0.
... *snip* ...
mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.

The assemble fails even with --force/-f, I think because of the '-1'
slot assignment on one of the two drives.
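
As an illustration only, here is a minimal Python sketch (not part of
mdadm) that picks the slot numbers out of verbose assemble output like
the above and lists the members reported with a negative slot. The
function names and the regular expression are my own assumptions,
matched against the message format shown in this thread; other mdadm
versions may word the line differently.

```python
import re

# Assumed line format, taken from the output quoted in this thread:
#   mdadm: /dev/sdc is identified as a member of /dev/md0, slot -1.
SLOT_RE = re.compile(
    r"^mdadm: (\S+) is identified as a member of (\S+), slot (-?\d+)\.?$"
)


def parse_slots(output):
    """Return {device: slot} for every 'identified as a member' line."""
    slots = {}
    for line in output.splitlines():
        match = SLOT_RE.match(line.strip())
        if match:
            slots[match.group(1)] = int(match.group(3))
    return slots


def unslotted(slots):
    """Return the devices whose reported slot is negative, sorted by name."""
    return sorted(dev for dev, slot in slots.items() if slot < 0)


SAMPLE = """\
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdc is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 0.
... *snip* ...
mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.
"""

if __name__ == "__main__":
    found = parse_slots(SAMPLE)
    print(found)
    print("no active slot:", unslotted(found))
```

With the sample above, only /dev/sdd carries a usable slot, which is
consistent with the "assembled from 1 drive" message.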

 >The other idea I have in mind is to - after a backup - recreate the
 >array using the initial configuration (raid-level 5, num-devices 3,
 >etc), and hope that the array can pick itself up again.
 >
 >Any thoughts much appreciated - thanks for helping out :)

This is the last alternative, when you're sure the 2 disks are fine.
Then just re-create the array replacing the /dev/3rd-drive by the
word "missing". This won't change the data, it'll just re-write the
superblocks. Then do a read-only fsck.

I think I'll give this a go: I'll take full image copies first and
then attempt a recreate.
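
To make the recreate idea concrete, below is a hedged Python sketch
that only builds the mdadm command line as a list and never runs it.
It uses the --create, --level and --raid-devices options together with
the word "missing" in place of the absent drive, as suggested above.
The helper names are hypothetical. The member order is an assumption:
the output earlier in this thread shows /dev/sdd in slot 0, but the
original slot of /dev/sdc is not known, so both remaining positions
are listed as candidates. Chunk size, layout and metadata version
would also have to match the original array and are not covered here.

```python
def candidate_orders(known, unknown_dev, num_devices):
    """Yield slot-ordered member lists, trying unknown_dev in each free slot.

    known maps slot number -> device path; None marks an absent member.
    """
    for slot in range(num_devices):
        if slot in known:
            continue
        order = [known.get(i) for i in range(num_devices)]
        order[slot] = unknown_dev
        yield order


def build_recreate_command(md_device, level, members):
    """Build (but do not run) an mdadm --create command for a degraded array."""
    names = ["missing" if dev is None else dev for dev in members]
    return [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(names)}",
        *names,
    ]


if __name__ == "__main__":
    # /dev/sdd in slot 0 comes from the assemble output in this thread;
    # where /dev/sdc belongs is a guess, so print every possibility.
    for order in candidate_orders({0: "/dev/sdd"}, "/dev/sdc", 3):
        print(" ".join(build_recreate_command("/dev/md0", 5, order)))
```

Printing the candidates rather than executing them keeps this safe to
try; any real attempt should be made against the image copies, followed
by a read-only fsck to see which ordering yields a sane filesystem.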

Cheers,
Jay

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html