Question about RAID 5 array rebuild with mdadm




I'm using CentOS 4.5 right now, and I had a RAID 5 array stop because two drives became unavailable. After reseating the cables several times and shutting down and restarting, I was able to see the drives again. This is when I snatched defeat from the jaws of victory. Please, someone with vast knowledge of how RAID 5 works with mdadm, tell me whether there is any chance at all that this array will pull through with most or all of my data.

Background info about the machine
/dev/md0 is a RAID1 consisting of /dev/sda1 and /dev/sdb1
/dev/md1 is a RAID1 consisting of /dev/sda2 and /dev/sdb2
/dev/md2 (our special friend) is a RAID5 consisting of /dev/sd[c-j]

/dev/sdi and /dev/sdj were the drives that detached from the array and were marked as faulty.

I did the following things that in hindsight were probably VERY BAD

Step 1 (Misassign drives to wrong array):
I could probably have had things going again in a tenth of a second if I hadn't typed this:
mdadm --manage --add /dev/md0 /dev/sdi
mdadm --manage --add /dev/md0 /dev/sdj

This clobbered the superblock on each drive and replaced it with that of /dev/md0, yes?
Well, that's what mdadm --misc --examine reported for /dev/sdi and /dev/sdj, anyhow.
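
For anyone wanting to repeat that check across the whole set, here is a minimal sketch. It is a dry run that only prints the mdadm --misc --examine command for each member of /dev/md2, so nothing touches the disks until the list has been reviewed. The helper name examine_cmds is made up for illustration and is not part of mdadm.

```shell
#!/bin/sh
# Dry run: print (do not execute) the examine command for each RAID5 member.
# examine_cmds is a hypothetical helper name, not an mdadm feature.
examine_cmds() {
    for i in c d e f g h i j; do
        echo "mdadm --misc --examine /dev/sd$i"
    done
}

examine_cmds
```

Piping the output to sh as root (examine_cmds | sh) would actually run the commands; --examine only reads the superblock.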

Ok, so what next?
Step 2 (rebuild the array but make sure the params are right!):
I wipe out the superblocks on all of the drives in the array and rebuild with --assume-clean
for i in c d e f g h i j ; do mdadm --zero-superblock /dev/sd$i ; done
mdadm --create /dev/md2 --assume-clean --level=5 --raid-devices=8 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj

OK, now it says that the array is recovering and will take about 10 hours to rebuild. /dev/sd[c-i] say that they are "active sync" and /dev/sdj says it's a spare that's rebuilding. But now I scroll back in my history and see that, oops, the chunk size is WRONG. Not only that, but I don't stop the array until the rebuild is at around 8%.
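
To put that 8% in perspective, here is a back-of-the-envelope sketch. It assumes the recovery writes to /dev/sdj linearly from the start of the device, and it uses a hypothetical 500 GB drive size (DRIVE_GB), since the real capacity isn't given here; the helper name rewritten_gb is also made up.

```shell
#!/bin/sh
# Rough estimate of how much of one member a partial rebuild rewrote.
# DRIVE_GB is a hypothetical size; substitute the real capacity of /dev/sdj.
DRIVE_GB=500
PERCENT_DONE=8   # rebuild progress when the array was stopped

rewritten_gb() {
    # Integer arithmetic: size * percent / 100
    echo $(( $1 * $2 / 100 ))
}

echo "approx. $(rewritten_gb "$DRIVE_GB" "$PERCENT_DONE") GB of /dev/sdj rewritten"
```

This only gives a sense of scale; it says nothing about whether the rewritten blocks match what was there before.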

OK, I stop the array and rebuild with
mdadm --create /dev/md2 --assume-clean --level=5 --chunk --raid-devices=8 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj

Now it says it's going to take another 10 hours to rebuild.

How likely is it that my data are irretrievable, and if they are, at which step would that have happened?
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
