I have managed to pickle my RAID 1 install after a disk crash

Hi,

I recently had the misfortune of a disk failure. Luckily the disk was part of a RAID1 setup, so nothing was lost - yet... I should mention that I am mirroring /boot, swap and /, and am using mdadm.

I had thought that I had set up grub correctly to allow booting off either disk, but did not test it - my bad. So when I replaced the drive - hda - I thought that the system would boot off hdd and then go through the process of rebuilding the array with the new drive. But the system would not boot. I tried various things - BIOS settings, etc. - but the Grub splash screen would not appear when I tried to boot off hdd.
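For reference, here is what I believe I should have done originally to make the second disk bootable (a sketch using the GRUB legacy shell; the device names are from my setup, and the partition number is an assumption based on /boot being the first partition):

```shell
# Install GRUB onto the second disk's MBR. The "device" command tells
# grub to treat hdd as (hd0), so the stage files are located on hdd
# itself even when the BIOS later presents it as the first disk.
grub --batch <<EOF
device (hd0) /dev/hdd
root (hd0,0)
setup (hd0)
quit
EOF
```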

I swapped cables - and drive jumpers - so that my previous hdd was now hda, and then successfully rebooted the system. So far so good. Not sure why the disk would boot as hda and not hdd - maybe a BIOS issue with my motherboard, even though it does allow specifying IDE 0-4 as boot devices.

So I had the system up and running - in a degraded RAID state - and started working on bringing the RAID 1 scenario back. I partitioned the replacement drive, now hdd, and all looked well. It didn't look like I could simply add the drive to the array, as cat /proc/mdstat implied to me that the first disk in the array had failed, and I was worried about copying the contents of the second drive - which mdadm thought was good - over the drive that actually had the good stuff on it. I tried various other things with mdadm, like stopping and re-creating the raid devices, etc., but with no success - probably user error.
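The textbook recovery, as I understand it, would have been something like the following (a sketch; hdd is assumed to be the new, blank disk - in my case I held off because mdadm seemed confused about which disk held the good data):

```shell
# Clone the surviving disk's partition table onto the replacement,
# then hot-add each partition into its degraded array.
# CAUTION: the second sfdisk overwrites hdd's partition table -
# double-check which disk is which before running.
sfdisk -d /dev/hda | sfdisk /dev/hdd

mdadm /dev/md1 --add /dev/hdd1
mdadm /dev/md2 --add /dev/hdd2
mdadm /dev/md3 --add /dev/hdd3

cat /proc/mdstat        # watch the resync progress
```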

So now I am not sure how to proceed.

cat /proc/mdstat yields this:

Code:
lucky root # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath]
read_ahead 1024 sectors
md3 : active raid1 ide/host0/bus0/target0/lun0/part3[1]
      12377984 blocks [2/1] [_U]

md1 : active raid1 ide/host0/bus1/target1/lun0/part1[1]
      64128 blocks [2/1] [_U]

md2 : active raid1 ide/host0/bus1/target1/lun0/part2[1]
      248896 blocks [2/1] [_U]

unused devices: <none>


Which implies to me that things are very messed up. I think this because of the following snippets:


Code:
md3 : active raid1 ide/host0/bus0/target0/lun0/part3[1]
      12377984 blocks [2/1] [_U]

and

Code:
md1 : active raid1 ide/host0/bus1/target1/lun0/part1[1]
      64128 blocks [2/1] [_U]


different buses and targets - so the surviving members of my arrays are on different physical disks...
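Boiling the two snippets down with a bit of awk makes the mismatch plain (this just re-parses the /proc/mdstat text quoted above; if I read the devfs names right, bus0/target0 is hda and bus1/target1 is hdd):

```shell
# Print each array, its one surviving member, and its status flags.
# [_U] means slot 0 is missing and slot 1 is up.
mdstat='md3 : active raid1 ide/host0/bus0/target0/lun0/part3[1]
      12377984 blocks [2/1] [_U]

md1 : active raid1 ide/host0/bus1/target1/lun0/part1[1]
      64128 blocks [2/1] [_U]'

echo "$mdstat" | awk '/^md/    { md = $1; dev = $5 }
                      /blocks/ { print md, dev, $NF }'
```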

I spent some time searching around and came to the conclusion that my RAID config is definitely borked.

I am thinking that the best thing for me to do now is to deactivate RAID completely, then come back and do a complete RAID re-config with my disks the way they are. But I can't find a way to stop/delete the meta devices so that I can start from scratch. I am running on my /dev/hdax config with no /dev/mdx devices mounted.
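Is something like this the right way to do it? This is my best guess from the mdadm manpage (a sketch; the partition list is an assumption based on my three mirrored partitions per disk):

```shell
# Stop each md device, then erase the RAID superblocks so the old
# (now inconsistent) metadata cannot be auto-assembled again.
# This destroys only the RAID metadata, but run it solely on
# partitions you really mean to recycle.
mdadm --stop /dev/md1
mdadm --stop /dev/md2
mdadm --stop /dev/md3

for part in /dev/hda1 /dev/hda2 /dev/hda3 \
            /dev/hdd1 /dev/hdd2 /dev/hdd3; do
    mdadm --zero-superblock "$part"
done
```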

Any thoughts ?

Thanx.
--
Dave Dmytriw
Principal, NetCetera Solutions Inc.
Calgary, AB
403-703-1399
daved@xxxxxxxxxxxxxxxxxxxxxxx
http://www.netcetera-solutions.com

