Restore LVM2 RAID-1 without Reboot

Hi there,

I have set up a RAID-1 between two PVs on two different hard disks in a RHEL-6 environment.
I am having a problem restoring the broken mirror without rebooting the OS.

Here is how to reproduce my problem.
1. First, I set up a RAID-1 LV that looks like this:
[root@shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  lv_home             vg_root rwi-aom-  1.50g                             100.00         lv_home_rimage_0:0-47 lv_home_rimage_1:0-47  
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g                                            /dev/sda2:192-239                            
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g                                            /dev/sdb2:34-81                              
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m                                            /dev/sda2:177-177                            
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m                                            /dev/sdb2:33-33                              
[root@shi-rhel63 home]# pvs
  PV         VG      Fmt  Attr PSize  PFree  
  /dev/sda2  vg_root lvm2 a--  17.84g 224.00m
  /dev/sdb2  vg_root lvm2 a--  18.84g   1.22g
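For context, this is roughly how such an LV is created with the raid1 segment type. This is a reconstructed sketch, not the exact command I ran; the size, names, and PV paths are taken from the listings above:

```shell
# Reconstructed setup sketch (assumptions: names/paths as in the lvs/pvs
# output above). Create a two-leg RAID-1 LV, one rimage/rmeta pair per PV:
lvcreate --type raid1 -m 1 -L 1.5G -n lv_home vg_root /dev/sda2 /dev/sdb2
# Put a filesystem on it and mount it where the dd test runs:
mkfs.ext4 /dev/vg_root/lv_home
mount /dev/vg_root/lv_home /home
```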

I can test the mirror by performing a dd write and watching iostat on both physical disks:
[root@shi-rhel63 home]# dd if=/dev/zero of=test bs=1M count=10000 &
[1] 23388
[root@shi-rhel63 home]# iostat -x 1 sda sdb -m
Linux 2.6.32-279.el6.x86_64 (shi-rhel63) 06/11/13 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.07    0.35    0.32    0.00   99.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.05     0.78    0.38    0.14     0.00     0.00    22.14     0.01   11.07   5.63   0.29
sdb               0.11     0.79    0.14    0.23     0.00     0.00    27.94     0.00    5.60   3.71   0.14

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    7.14   92.86    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5702.04    1.02   56.12     0.01    22.46   805.25    50.75  911.66  17.86 102.04
sdb               0.00  5173.47    0.00   71.43     0.00    20.42   585.43     1.21   16.97   3.44  24.59

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    8.08   91.92    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5131.31    0.00   52.53     0.00    19.71   768.50    51.39  894.83  19.23 101.01
sdb               0.00  5121.21    0.00   69.70     0.00    18.20   534.70     1.15   15.96   3.41  23.74

As you can see, both sda and sdb get similar write I/O, so the mirror is in fact working.

2. Now I simulate a disk failure on sdb by removing the disk (I use VMware, so this is very easy). Of course, the mirror is now broken, as shown below:

[root@shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  Couldn't find device with uuid pnsMYs-Ce4t-9KYR-3Zfs-GItC-k5SZ-VidQ30.
  lv_home             vg_root rwi-aom-  1.50g                             100.00         lv_home_rimage_0:0-47 lv_home_rimage_1:0-47  
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g                                            /dev/sda2:192-239                            
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g                                            unknown device:34-81                         
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m                                            /dev/sda2:177-177                            
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m                                            unknown device:33-33                         
[root@shi-rhel63 home]# pvs
  Couldn't find device with uuid pnsMYs-Ce4t-9KYR-3Zfs-GItC-k5SZ-VidQ30.
  PV             VG      Fmt  Attr PSize  PFree  
  /dev/sda2      vg_root lvm2 a--  17.84g 224.00m
  unknown device vg_root lvm2 a-m  18.84g   1.22g

I already notice a slight problem here: the mirror status above still shows 100% under Copy%, but I accept that as a minor presentation issue.
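To double-check whether the kernel itself still believes both legs are in sync, the device-mapper status of the LV can be queried directly, bypassing the lvm reporting layer (a sketch; the dm name is the mangled vg-lv form):

```shell
# Ask device-mapper for the raid status line of the LV directly;
# for a raid1 target it includes a per-leg health string, e.g. "AA"
# for two healthy legs, with a different character for a failed leg,
# followed by a synced/total sector count.
dmsetup status vg_root-lv_home
# Compare against what the lvm tools report:
lvs -a -o name,attr,copy_percent vg_root
```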

3. Now I put the removed disk back, and I would like a way to incrementally resync only the differences accumulated since the mirror broke. Note that the same disk now shows up as sdc.

[root@shi-rhel63 home]# pvs
  PV         VG      Fmt  Attr PSize  PFree  
  /dev/sda2  vg_root lvm2 a--  17.84g 224.00m
  /dev/sdc2  vg_root lvm2 a--  18.84g   1.22g
[root@shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  lv_home             vg_root rwi-aom-  1.50g                             100.00         lv_home_rimage_0:0-47 lv_home_rimage_1:0-47  
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g                                            /dev/sda2:192-239                            
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g                                            /dev/sdc2:34-81                              
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m                                            /dev/sda2:177-177                            
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m                                            /dev/sdc2:33-33  
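One sanity check worth recording: the PV UUID survives the device rename, so sdc2 can be confirmed to be the old sdb2 leg (and not a fresh PV) by comparing it against the UUID lvm complained about while the disk was missing:

```shell
# Show the PV UUID of the returned device; it should match the
# pnsMYs-Ce4t-... uuid from the "Couldn't find device" messages above.
pvs -o pv_name,pv_uuid /dev/sdc2
```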

So everything looks perfect, but if I perform the same dd write test, here is what I get:
[root@shi-rhel63 home]# iostat -x 1 sda sdc -m
Linux 2.6.32-279.el6.x86_64 (shi-rhel63) 06/11/13 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.07    0.36    0.42    0.00   99.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.05     7.70    0.39    0.21     0.00     0.03   109.24     0.11  174.10   6.77   0.40
sdc               0.02     0.00    0.01    0.00     0.00     0.00     8.30     0.00    3.07   2.85   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.15   94.85    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  6281.44    0.00   49.48     0.00    22.14   916.38   138.42 2932.19  20.83 103.09
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.10   94.90    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  6213.27    0.00   54.08     0.00    23.99   908.30   136.34 2812.40  18.87 102.04
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    6.06   93.94    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5141.41    0.00   50.51     0.00    22.23   901.36   133.47 2612.74  20.00 101.01
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

So it is clear that the re-added mirror leg is not being written to at all.

What is really interesting is that after a reboot everything works properly again. But is there a way to restore the mirror without rebooting?
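For reference, the kind of reboot-free, incremental resync I am hoping for would look roughly like this. This is only a sketch of what I would expect to work, assuming lvchange --refresh reloads the LV's kernel tables once the PV is visible again:

```shell
# Re-scan block devices so lvm notices the returned PV under its new name:
pvscan
# Reload the LV's device-mapper tables so the kernel picks up the
# re-attached leg; ideally this triggers an incremental resync rather
# than a full copy:
lvchange --refresh vg_root/lv_home
# Then watch the resync progress:
lvs -a -o name,copy_percent vg_root
```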

Thanks a lot,
Shi
PS. My OS info

[root@shi-rhel63 home]# uname -a
Linux shi-rhel63 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@shi-rhel63 home]# lvm version
  LVM version:     2.02.95(2)-RHEL6 (2012-05-16)
  Library version: 1.02.74-RHEL6 (2012-05-16)
  Driver version:  4.22.6

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
