rgmanager not running

"Balaji Sundar" <balajisundar@xxxxxxxxxxxxx> · Mon, 7 Mar 2011 14:03:41 +0530 (IST)

Dear All,

I am using RHEL 6 with kernel version 2.6.32-71.el6.i686.

I have configured Cluster Suite with two servers:
Server 1 : 192.168.13.131 (hostname: primary)
Server 2 : 192.168.13.132 (hostname: secondary)
Floating : 192.168.13.133 (assumed by the currently active server)

I have verified that the cman service is running and that cluster.conf
is valid, using the ccs_config_validate command.

Finally, I found that rgmanager is not running and the services have not started:
[root@primary cluster]# service rgmanager status
rgmanager dead but pid file exists
[root@primary cluster]#
[root@primary cluster]# cman_tool services
[root@primary cluster]#
[root@primary cluster]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: EMSCluster
Cluster Id: 808
Cluster Member: Yes
Cluster Generation: 96
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 7
Flags: 2node
Ports Bound: 0
Node name: primary
Node ID: 1
Multicast addresses: 239.192.3.43
Node addresses: 192.168.13.131
[root@primary cluster]#
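
For anyone comparing against their own node, individual fields of the
status output above can be pulled out with a small helper. This is only a
sketch: the function name cman_field is made up for illustration, and it
assumes the "Key: value" layout shown above.

```shell
# Hypothetical helper (not part of the cluster tools): print the value of
# one "Key: value" field read from `cman_tool status` output on stdin.
cman_field() {
    # Split each line on ": " and print the value when the key matches.
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Usage on a cluster node (assumes cman_tool is in PATH):
#   cman_tool status | cman_field Nodes
```

With the output above, the Nodes field is 1, so only this node is counted
as a member although two servers are configured.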

I found some error messages in the "/var/log/messages" file:
Mar  7 14:39:42 primary corosync[7155]:   [CMAN  ] quorum regained, resuming activity
Mar  7 14:39:42 primary corosync[7155]:   [QUORUM] This node is within the primary component and will provide service.
Mar  7 14:39:42 primary corosync[7155]:   [QUORUM] Members[1]: 1
Mar  7 14:39:42 primary corosync[7155]:   [QUORUM] Members[1]: 1
Mar  7 14:39:42 primary corosync[7155]:   [CPG   ] downlist received left_list: 0
Mar  7 14:39:42 primary corosync[7155]:   [CPG   ] chosen downlist from node r(0) ip(192.168.13.131)
Mar  7 14:39:42 primary corosync[7155]:   [MAIN  ] Completed service synchronization, ready to provide service.
Mar  7 14:39:44 primary fenced[7210]: fenced 3.0.12 started
Mar  7 14:39:45 primary dlm_controld[7224]: dlm_controld 3.0.12 started
Mar  7 14:39:45 primary gfs_controld[7254]: gfs_controld 3.0.12 started
Mar  7 14:39:45 primary kernel: dlm: Using TCP for communications
Mar  7 14:39:45 primary dlm_controld[7224]: dlm_join_lockspace no fence domain
Mar  7 14:39:45 primary dlm_controld[7224]: process_uevent online@ error -1 errno 2
Mar  7 14:39:45 primary kernel: dlm: rgmanager: group join failed -1 -1
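
To narrow the log down to the fencing and DLM messages, a filter along
these lines could be used. It is a rough sketch: the function name
cluster_errors is invented for illustration, and the pattern assumes the
syslog tags look exactly like the ones in the excerpt above.

```shell
# Hypothetical helper (not part of the cluster tools): keep only the
# syslog lines written by fenced, dlm_controld, or the kernel dlm module.
cluster_errors() {
    grep -E 'fenced\[|dlm_controld\[|kernel: dlm:'
}

# Usage (assumes the default syslog location shown in this post):
#   cluster_errors < /var/log/messages
```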

I also found some error messages in the "/var/log/cluster/dlm_controld.log" file:
Mar 07 14:39:45 dlm_controld dlm_controld 3.0.12 started
Mar 07 14:39:45 dlm_controld dlm_join_lockspace no fence domain
Mar 07 14:39:45 dlm_controld process_uevent online@ error -1 errno 2

I don't know what the problem is. Can someone shed some light on it?

Thanks in advance

--Regards
S.Balaji
