RHEL 5.7 Cluster - GFS2 volume stuck in "LEAVE_START_WAIT" status

Fellow Cluster Compatriots,
I'm looking for some guidance here. Whenever my RHEL 5.7 cluster gets into "LEAVE_START_WAIT" on a given iSCSI volume, the following occurs:
  1. I can't read from or write to the volume.
  2. I can't unmount it from any node.
  3. In-flight/pending I/Os are impossible to identify or kill, since lsof against the mount point fails. Basically, all I/O operations on the volume stall or fail (see the sketch right after this list for how I've been trying to confirm that).
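
For reference, this is roughly how I've been trying to confirm the hung I/O from userspace. It's only a sketch: the D-state check is a heuristic, and the grep pattern is just mine, not anything the tools document.

  ps axo pid,stat,wchan:32,comm | awk '$2 ~ /D/'   # processes stuck in uninterruptible sleep (typical of hung GFS2 I/O)
  dmesg | grep -iE 'gfs2|dlm'                      # kernel-side GFS2/DLM messages around the time of the hang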

So my questions are:

  1. What does this output from group_tool -v actually indicate: "00030005 LEAVE_START_WAIT 12 c000b0002 1"? The group_tool man page doesn't describe these fields.
  2. Does anyone have a list of what these fields represent?
  3. Corrective actions: how do I get out of this state without rebooting the entire cluster? (The only non-reboot idea I've come up with so far is sketched after this list.)
  4. Is it possible to determine the offending node?
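
For what it's worth, the only non-reboot approach I've come up with so far is sketched below. I haven't tried it, so please treat it as a guess rather than a known-good procedure; <nodename> is just a placeholder for whichever node turns out to be the one stuck leaving.

  group_tool dump gfs cluster3_disk2   # dump gfs_controld's debug log for the stuck mount group
  cman_tool services                   # cross-check the group state as cman sees it
  fence_node <nodename>                # fence only the stuck node instead of rebooting the whole cluster
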
thanks,
-Cedric


//misc output

root@bl13-node13:~# clustat
Cluster Status for cluster3 @ Sat Jun  2 20:47:08 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 bl01-node01                                      1 Online, rgmanager
 bl04-node04                                      4 Online, rgmanager
 bl05-node05                                      5 Online, rgmanager
 bl06-node06                                      6 Online, rgmanager
 bl07-node07                                      7 Online, rgmanager
 bl08-node08                                      8 Online, rgmanager
 bl09-node09                                      9 Online, rgmanager
 bl10-node10                                     10 Online, rgmanager
 bl11-node11                                     11 Online, rgmanager
 bl12-node12                                     12 Online, rgmanager
 bl13-node13                                     13 Online, Local, rgmanager
 bl14-node14                                     14 Online, rgmanager
 bl15-node15                                     15 Online, rgmanager


 Service Name                    Owner (Last)                    State
 ------- ----                    ----- ------                    -----
 service:httpd                   bl05-node05                     started
 service:nfs_disk2               bl08-node08                     started


root@bl13-node13:~# group_tool -v
type             level name            id       state node id local_done
fence            0     default         0001000d none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     clvmd           0001000c none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     cluster3_disk1  00020005 none       
[4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     cluster3_disk2  00040005 none       
[4 5 6 7 8 9 10 11 13 14 15]
dlm              1     cluster3_disk7  00060005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     cluster3_disk8  00080005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     cluster3_disk9  000a0005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     disk10          000c0005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     rgmanager       0001000a none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
dlm              1     cluster3_disk3  00020001 none       
[1 5 6 7 8 9 10 11 12 13]
dlm              1     cluster3_disk6  00020008 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
gfs              2     cluster3_disk1  00010005 none       
[4 5 6 7 8 9 10 11 12 13 14 15]
gfs              2     cluster3_disk2  00030005 LEAVE_START_WAIT 12 c000b0002 1
[4 5 6 7 8 9 10 11 13 14 15]

gfs              2     cluster3_disk7  00050005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
gfs              2     cluster3_disk8  00070005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
gfs              2     cluster3_disk9  00090005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
gfs              2     disk10          000b0005 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]
gfs              2     cluster3_disk3  00010001 none       
[1 5 6 7 8 9 10 11 12 13]
gfs              2     cluster3_disk6  00010008 none       
[1 4 5 6 7 8 9 10 11 12 13 14 15]

root@bl13-node13:~# gfs2_tool list
253:15 cluster3:cluster3_disk6
253:16 cluster3:cluster3_disk3
253:18 cluster3:disk10
253:17 cluster3:cluster3_disk9
253:19 cluster3:cluster3_disk8
253:21 cluster3:cluster3_disk7
253:22 cluster3:cluster3_disk2
253:23 cluster3:cluster3_disk1
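
(Aside: to cross-reference the 253:NN device numbers above with the LVs below, I've just been reading the mapping off device-mapper -- noting it here in case I'm misreading the minors:)

  dmsetup ls            # device-mapper names with their (major, minor) pairs
  ls -l /dev/mapper/    # same mapping, read straight off the device nodes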

root@bl13-node13:~# lvs
    Logging initialised at Sat Jun  2 20:50:03 2012
    Set umask from 0022 to 0077
    Finding all logical volumes
  LV                            VG                            Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  lv_cluster3_Disk7             vg_Cluster3_Disk7             -wi-ao   3.00T                                     
  lv_cluster3_Disk9             vg_Cluster3_Disk9             -wi-ao 200.01G                                     
  lv_Cluster3_libvert           vg_Cluster3_libvert           -wi-a- 100.00G                                     
  lv_cluster3_disk1             vg_cluster3_disk1             -wi-ao 100.00G                                     
  lv_cluster3_disk10            vg_cluster3_disk10            -wi-ao  15.00T                                     
  lv_cluster3_disk2             vg_cluster3_disk2             -wi-ao 220.00G                                     
  lv_cluster3_disk3             vg_cluster3_disk3             -wi-ao 330.00G                                     
  lv_cluster3_disk4_1T-kvm-thin vg_cluster3_disk4_1T-kvm-thin -wi-a-   1.00T                                     
  lv_cluster3_disk5             vg_cluster3_disk5             -wi-a- 555.00G                                     
  lv_cluster3_disk6             vg_cluster3_disk6             -wi-ao   2.00T                                     
  lv_cluster3_disk8             vg_cluster3_disk8             -wi-ao   2.00T

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
