The rgmanager service is not necessary if the cluster has no resources to manage.
Further information on the cluster status is needed, such as the output of:

# clustat

If it says all the nodes are online, then more debug logs will be needed to find out the problem.

--Sunil

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Balaji Sundar
Sent: Monday, March 07, 2011 2:04 PM
To: linux-cluster@xxxxxxxxxx
Subject: rgmanager not running

Dear All,

I am using RHEL6 Linux and the kernel version is 2.6.32-71.el6.i686.

I have configured Cluster Suite with 2 servers:

Server 1 : 192.168.13.131 IP Address and hostname is primary
Server 2 : 192.168.13.132 IP Address and hostname is secondary
Floating : 192.168.13.133 IP Address (Assumed by currently active server)

I have verified that the cman service is running and that cluster.conf is valid, using the ccs_config_validate command.

Finally, I found that rgmanager is not running and the services are not started.

[root@primary cluster]# service rgmanager status
rgmanager dead but pid file exists
[root@primary cluster]#
[root@primary cluster]# cman_tool services
[root@primary cluster]#
[root@primary cluster]# cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: EMSCluster
Cluster Id: 808
Cluster Member: Yes
Cluster Generation: 96
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Node votes: 1
Quorum: 1
Active subsystems: 7
Flags: 2node
Ports Bound: 0
Node name: primary
Node ID: 1
Multicast addresses: 239.192.3.43
Node addresses: 192.168.13.131
[root@primary cluster]#

I found some error messages in the "/var/log/messages" file:

Mar 7 14:39:42 primary corosync[7155]: [CMAN ] quorum regained, resuming activity
Mar 7 14:39:42 primary corosync[7155]: [QUORUM] This node is within the primary component and will provide service.
Mar 7 14:39:42 primary corosync[7155]: [QUORUM] Members[1]: 1
Mar 7 14:39:42 primary corosync[7155]: [QUORUM] Members[1]: 1
Mar 7 14:39:42 primary corosync[7155]: [CPG ] downlist received left_list: 0
Mar 7 14:39:42 primary corosync[7155]: [CPG ] chosen downlist from node r(0) ip(192.168.13.131)
Mar 7 14:39:42 primary corosync[7155]: [MAIN ] Completed service synchronization, ready to provide service.
Mar 7 14:39:44 primary fenced[7210]: fenced 3.0.12 started
Mar 7 14:39:45 primary dlm_controld[7224]: dlm_controld 3.0.12 started
Mar 7 14:39:45 primary gfs_controld[7254]: gfs_controld 3.0.12 started
Mar 7 14:39:45 primary kernel: dlm: Using TCP for communications
Mar 7 14:39:45 primary dlm_controld[7224]: dlm_join_lockspace no fence domain
Mar 7 14:39:45 primary dlm_controld[7224]: process_uevent online@ error -1 errno 2
Mar 7 14:39:45 primary kernel: dlm: rgmanager: group join failed -1 -1

I found some error messages in the "/var/log/cluster/dlm_controld.log" file:

Mar 07 14:39:45 dlm_controld dlm_controld 3.0.12 started
Mar 07 14:39:45 dlm_controld dlm_join_lockspace no fence domain
Mar 07 14:39:45 dlm_controld process_uevent online@ error -1 errno 2

I don't know what the problem is. Can someone throw some light on this peculiar problem?

Thanks in advance

--Regards
S.Balaji

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
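
For anyone who wants to script a first-pass check of the output quoted above, here is a minimal Python sketch. It assumes the `cman_tool status` output keeps the "Key: value" layout shown in the post, and it only looks for the literal "no fence domain" text from the dlm_controld log. The function names and the sample constants are hypothetical, made up for illustration; this is not part of any cluster tool and it does not diagnose the root cause.

```python
# Hypothetical helpers for a quick look at the output quoted in this thread.
# Assumption: `cman_tool status` prints one "Key: value" pair per line.

def parse_cman_status(text):
    """Turn 'Key: value' lines into a dict (values kept as strings)."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            status[key.strip()] = value.strip()
    return status


def find_fence_domain_errors(log_lines):
    """Return the log lines that contain the 'no fence domain' message."""
    return [line for line in log_lines if "no fence domain" in line]


# Sample data copied from the post (subset of the status output).
SAMPLE_STATUS = """\
Cluster Name: EMSCluster
Nodes: 1
Expected votes: 1
Flags: 2node
Node name: primary
"""

SAMPLE_LOG = [
    "Mar 07 14:39:45 dlm_controld dlm_controld 3.0.12 started",
    "Mar 07 14:39:45 dlm_controld dlm_join_lockspace no fence domain",
    "Mar 07 14:39:45 dlm_controld process_uevent online@ error -1 errno 2",
]

status = parse_cman_status(SAMPLE_STATUS)
errors = find_fence_domain_errors(SAMPLE_LOG)

if status.get("Flags") == "2node" and status.get("Nodes") == "1":
    print("Only one node is listed as a member of this two-node cluster.")
for line in errors:
    print("Fence domain message:", line)
```

In the posted output this flags two things worth following up with clustat and further logs, as suggested in the reply: only one node is counted as a member although the 2node flag is set, and dlm_controld reports "no fence domain" just before the rgmanager group join fails.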