The pithy ruminations from frederic randriamora on Oct 29, 2010 4:30:03 pm entitled "RE: clustat stuck" were:

==> Hi,
==>
==> I have a 4 node cluster, with multipathed qdisk on a san. The nodes are
==> running redhat 5.4.

I've got a 3 node cluster, with multipathed qdisk on a SAN. The nodes are running CentOS 5.5:

Linux 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
lvm2-cluster-2.02.56-7.el5_5.4
cman-2.0.115-34.el5_5.4
rgmanager-2.0.52-6.el5.centos.8
openais-0.80.6-16.el5_5.9

==>
==> After a minor change made in cluster.conf on node3 properly propagated
==> by ccs_tool update, clustat is no longer correctly responding in the
==> other 3 nodes.

In my case, I failed a service over from node3 to node2, but made no cluster configuration changes.

==> node3 is neither nodeid 1 nor qdisk master.
==>
==> clustat on node3 runs fine

Similar. On node2, clustat works fine.

==>
==> clustat on the other nodes
==>
==> either hangs with
==> connect(8, {sa_family=AF_FILE, path="/var/run/cluster/rgmanager.sk"...}, 110
==> from strace
==>
==>
==> or times out with
==> Timed out waiting for a response from Resource Group Manager
==> without displaying the still running services
==>

Exactly the same behavior here.

==> cman_tool services et al. are just fine everywhere,
==>

Agreed. The actual services are running on each node. The report from cman_tool is correct, but querying the cluster with "clustat" or running operations with "clusvcadm" hangs or times out.

==> Although all the services are running fine, I cannot move/stop them
==> anymore with clusvcadm.
==>
==> How to get out of that situation?

Is there any solution to this issue?

Thanks,

Mark

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster