rgmanager or clustat problem

I am running a four-node GFS cluster with about 20 services per node.  All four nodes belong to the same failover domain, and each has a priority of 1.  My shared storage is an iSCSI SAN.
 
After rgmanager has been running for a couple of days, clustat produces the following result on all four nodes:

Timed out waiting for a response from Resource Group Manager
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  node01           Online, rgmanager
  node02           Online, Local, rgmanager
  node03           Online, rgmanager
  node04           Online, rgmanager

I also get a timeout when I try to determine the status of a particular service with "clustat -s servicename".

All of the services seem to be up and running, but clustat does not work.  Is there something wrong?  Is there a way for me to increase the timeout?
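
To record how long the query takes and whether it ends in the message above, something like the following could be run on each node. This is only a minimal sketch: the function name probe and the 60-second limit are assumptions for illustration, and it does not change clustat's own timeout; it just times the command and looks for the error text in the output.

```python
import subprocess
import time

# Error text exactly as clustat printed it in the output above.
TIMEOUT_MESSAGE = "Timed out waiting for a response from Resource Group Manager"


def probe(command, limit_seconds=60.0):
    """Run `command` and return (elapsed_seconds, saw_timeout_message, output).

    `limit_seconds` is an arbitrary guard so the script itself cannot hang;
    it is not a clustat setting.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # capture errors together with normal output
            universal_newlines=True,
            timeout=limit_seconds,
        )
        output = result.stdout
    except subprocess.TimeoutExpired as exc:
        # Keep whatever was printed before the guard expired.
        output = exc.output or ""
    if isinstance(output, bytes):
        output = output.decode(errors="replace")
    elapsed = time.monotonic() - start
    return elapsed, TIMEOUT_MESSAGE in output, output
```

For example, probe(["clustat"]) or probe(["clustat", "-s", "servicename"]) returns the elapsed time and a flag that is true when the Resource Group Manager timeout message appeared.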

clurgmgrd and dlm_recvd seem to be using a lot of CPU on node02: about 40 and 60 percent, respectively.

Thank you for your help.

cman_tool services:

NODE01:

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   2 run       -
[1 3 2 4]

DLM Lock Space:  "clvmd"                             1   3 run       -
[1 3 2 4]

DLM Lock Space:  "Magma"                             3   5 run       -
[1 3 2 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[2 1 3 4]

GFS Mount Group: "gfslv"                             6   7 run       -
[2 1 3 4]

User:            "usrm::manager"                     2   4 run       -
[1 3 2 4]

NODE02:
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   5 run       -
[1 3 2 4]

DLM Lock Space:  "clvmd"                             1   1 run       -
[1 3 2 4]

DLM Lock Space:  "Magma"                             3   3 run       -
[1 3 2 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[1 4 2 3]

GFS Mount Group: "gfslv"                             6   7 run       -
[1 4 2 3]

User:            "usrm::manager"                     2   2 run       -
[1 3 2 4]

NODE03:
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   2 run       -
[1 2 3 4]

DLM Lock Space:  "clvmd"                             1   3 run       -
[1 2 3 4]

DLM Lock Space:  "Magma"                             3   5 run       -
[1 2 3 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[1 2 4 3]

GFS Mount Group: "gfslv"                             6   7 run       -
[1 2 4 3]

User:            "usrm::manager"                     2   4 run       -
[1 2 3 4]

NODE04:
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   2 run       -
[1 2 3 4]

DLM Lock Space:  "clvmd"                             1   3 run       -
[1 2 3 4]

DLM Lock Space:  "Magma"                             3   5 run       -
[1 2 3 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[1 4 2 3]

GFS Mount Group: "gfslv"                             6   7 run       -
[1 4 2 3]

User:            "usrm::manager"                     2   4 run       -
[1 2 3 4]
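
The member lists above appear in a different order on each node, so a quick way to confirm that every node reports the same members for every group is to compare them as sets. The sketch below is only an illustration: it assumes the text is pasted in the layout used in this message (a "NODE01:" style label, which is my own labelling and not part of the cman_tool output, followed by each group line and its bracketed member list), and the function names are hypothetical.

```python
import re

# "NODE01:" style labels separating the per-node listings (labels as used in this post).
NODE_RE = re.compile(r"^(NODE\d+):\s*$")
# Group lines such as: DLM Lock Space:  "clvmd"   1   3 run   -
GROUP_RE = re.compile(r'^([^:"]+):\s+"([^"]+)"')
# Member lines such as: [1 3 2 4]
MEMBERS_RE = re.compile(r"^\[([\d\s]+)\]\s*$")


def parse_services(text):
    """Return {(group_type, group_name): {node_label: [member ids]}}."""
    groups = {}
    node = None
    group = None
    for raw in text.splitlines():
        line = raw.strip()
        match = NODE_RE.match(line)
        if match:
            node, group = match.group(1), None
            continue
        match = GROUP_RE.match(line)
        if match:
            group = (match.group(1).strip(), match.group(2))
            continue
        match = MEMBERS_RE.match(line)
        if match and node is not None and group is not None:
            ids = [int(x) for x in match.group(1).split()]
            groups.setdefault(group, {})[node] = ids
            group = None
    return groups


def mismatched_groups(groups):
    """List the groups whose member sets are not identical on every node."""
    bad = []
    for group, per_node in sorted(groups.items()):
        member_sets = {frozenset(ids) for ids in per_node.values()}
        if len(member_sets) > 1:
            bad.append(group)
    return bad
```

Calling mismatched_groups(parse_services(text)) on the pasted listings returns the groups on which the nodes disagree about membership; ordering differences alone are ignored.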

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
