Major problems after soft raid 5 failure

Hi Folks,

I'm writing in the hope that someone can give me some advice on some big problems I'm having with a 1.8TB LV. If this is the wrong place for this kind of question and you happen to know the right place to ask, please point me there.

First, let me explain my setup and what has happened.

I have (had) two software RAID 5 arrays created with mdadm on a 2.6.22-based system. Each array contained three disks, though the two sets used different disk sizes.

md0 (RAID 5):
/dev/sde1 (1TB)
/dev/sdf1 (1TB)
/dev/sdg1 (1TB)

md1 (RAID 5):
/dev/sda1 (750GB)
/dev/sdb1 (750GB)
/dev/sdc1 (750GB)
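
For reference, the arrays were originally created more or less like this (the exact options are from memory, so treat it as a sketch rather than the literal commands):

  mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sde1 /dev/sdf1 /dev/sdg1
  mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1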

Initially I started out with a single logical volume called 'array' in the volume group 'raid', sitting on md0. Over time this volume was populated to 90% capacity.

So, following the steps outlined in various how-tos throughout the internet, I managed to extend the volume group 'raid' onto md1 and then extend the logical volume 'array' into the additional space. I then used resize2fs to resize the file system (while it was offline, of course).
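
From memory, the sequence was something like the following (exact options may have differed slightly, and /mnt/array here is just a stand-in for the real mount point):

  pvcreate /dev/md1                       # turn md1 into an LVM physical volume
  vgextend raid /dev/md1                  # add it to the existing volume group
  lvextend -l +100%FREE /dev/raid/array   # grow the LV into all of the new free space
  umount /mnt/array                       # take the file system offline
  e2fsck -f /dev/raid/array               # fsck is required before an offline resize
  resize2fs /dev/raid/array               # grow the file system to fill the enlarged LV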

I then remounted the file system successfully and it had grown to roughly 3.1TB of usable space (great). After remounting I did a few simple test writes to it and copied a few ISO images over to make sure everything was working.

Well, to my great luck I awoke this morning to find that md1 had degraded. Last night the second disk in the set (/dev/sdb1) threw a few sector errors (nothing critical, or so I thought). Examining /proc/mdstat showed that the entire md1 array had failed: both /dev/sdb1 and /dev/sdc1 were marked as failed and offline. This had me worried, but I wasn't too concerned, as I had not yet written any critical data to the LV (at least nothing I couldn't recover).

After messing around with md1 for nearly two hours trying to figure out why both disks fell out of the array (I have yet to determine why it kicked /dev/sdc1 out, as no errors were found on it or reported for it), I decided to try a reboot in case a hung thread, or something else unexplained, could be corrected by a restart. At this point things went from problematic to downright horrible.
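
For what it's worth, the poking around during those two hours was roughly along these lines (nothing here writes to the array):

  cat /proc/mdstat            # md1 shown with both sdb1 and sdc1 failed
  mdadm --detail /dev/md1     # array state and which members are failed/removed
  mdadm --examine /dev/sdb1   # per-disk superblock and event counter
  mdadm --examine /dev/sdc1
  dmesg | grep -i sd          # the sector errors from last night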

As soon as the system came back online md1 was still nowhere to be found; md0 was there and still intact. However, because md1 was missing from the volume group, the volume group could not be activated and thus the logical volume was unavailable. After searching around I kept coming back to suggestions stating that removing the missing device from the volume group was the way to get things back online again. So I ran 'vgreduce --removemissing raid' and then 'lvchange -ay raid' to apply the changes. Neither command returned an error, and vgreduce noted that 'raid' was not available again.
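
To be explicit, the two commands were:

  vgreduce --removemissing raid   # drop the missing md1 PV from the volume group
  lvchange -ay raid               # try to bring the volume back online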

So as it stands now I have no logical volume; I have a volume group, and I have a functional md0 array. If I dump the first 50 or so megabytes of the md0 array, I can see the volume group information as well as the LV information, including various bits of file system information.
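
The dump and inspection were done with something along these lines (the file name and block counts are approximate):

  dd if=/dev/md0 of=/tmp/md0-head.img bs=1M count=50   # first ~50MB of the PV
  strings /tmp/md0-head.img | grep -C3 array           # the LVM text metadata mentions 'raid' and 'array'
  hexdump -C /tmp/md0-head.img | less                  # for poking around by hand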

At this point I'm wondering: can I recover the logical volume and get this 1.8TB of data back?

For completeness, here are the results of various display and scan commands:

root@Aria:/dev/disk/by-id# pvscan
 PV /dev/md0   VG raid   lvm2 [1.82 TB / 1.82 TB free]
 Total: 1 [1.82 TB] / in use: 1 [1.82 TB] / in no VG: 0 [0   ]

root@Aria:/dev/disk/by-id# pvdisplay
 --- Physical volume ---
 PV Name               /dev/md0
 VG Name               raid
 PV Size               1.82 TB / not usable 2.25 MB
 Allocatable           yes
 PE Size (KByte)       4096
 Total PE              476933
 Free PE               476933
 Allocated PE          0
 PV UUID               oI1oXp-NOSk-BJn0-ncEN-HaZr-NwSn-P9De9b

root@Aria:/dev/disk/by-id# vgscan
 Reading all physical volumes.  This may take a while...
 Found volume group "raid" using metadata type lvm2

root@Aria:/dev/disk/by-id# vgdisplay
 --- Volume group ---
 VG Name               raid
 System ID
 Format                lvm2
 Metadata Areas        1
 Metadata Sequence No  11
 VG Access             read/write
 VG Status             resizable
 MAX LV                0
 Cur LV                0
 Open LV               0
 Max PV                0
 Cur PV                1
 Act PV                1
 VG Size               1.82 TB
 PE Size               4.00 MB
 Total PE              476933
 Alloc PE / Size       0 / 0
 Free  PE / Size       476933 / 1.82 TB
 VG UUID               quRohP-EcsI-iheW-lbU5-rBjO-TnqS-JbjmZA

root@Aria:/dev/disk/by-id# lvscan
root@Aria:/dev/disk/by-id#

root@Aria:/dev/disk/by-id# lvdisplay
root@Aria:/dev/disk/by-id#


Thank you.

-cf

