I have two nodes on the same subnet. Both are up, they can ping each
other, and both are configured as members of a two-node cluster. When I
start cman on both nodes at the same time, the log says “X not a cluster
member after 60 sec post_join_delay”. The output of clustat shows the
other node as “Offline” and the local node as “Online, Local”.
The nodes then fence each other and power each other off. Please help me
determine why these nodes cannot join.
Below is some information from my systems. Red Hat Support is not getting
anywhere.

Thanks

----------------------------------

Node1: bplmft11
Node2: bplmft12

uname -a ->
Linux bplmft11 2.6.18-8.1.10.el5 #1 SMP Thu Aug 30 20:43:28 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

[root@bplmft11 ~]# clustat
msg_open: No such file or directory
Member Status: Quorate

  Member Name        ID   Status
  ------ ----        ---- ------
  bplmft12              1 Offline
  bplmft11              2 Online, Local

/etc/cluster/cluster.conf file (with the fencing levels
removed):

<?xml version="1.0" ?>
<cluster alias="plm_test" config_version="16" name="plm_test">
    <fence_daemon post_fail_delay="0" post_join_delay="60"/>
    <clusternodes>
        <clusternode name="bplmft12" nodeid="1" votes="1">
            <fence>
                <method name="1"/>
            </fence>
        </clusternode>
        <clusternode name="bplmft11" nodeid="2" votes="1">
            <fence>
                <method name="1"/>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_ilo" hostname="ilo-bplmft12" login="redhat_cluster_user" name="ilo-bplmft12" passwd="PASSWORD"/>
        <fencedevice agent="fence_ilo" hostname="ilo-bplmft11" login="redhat_cluster_user" name="ilo-bplmft11" passwd="PASSWORD"/>
    </fencedevices>
    <rm>
        <failoverdomains/>
        <resources/>
    </rm>
</cluster>

/var/log/messages after doing a “service cman start”
on both nodes:

Sep 26 13:33:04 bplmft11 ccsd[31407]: Cluster is not quorate. Refusing connection.
Sep 26 13:33:04 bplmft11 ccsd[31407]: Error while processing connect: Connection refused
Sep 26 13:33:04 bplmft11 ccsd[31407]: Initial status:: Quorate
Sep 26 13:33:10 bplmft11 snmpd[2616]: Connection from UDP: [127.0.0.1]:32771
Sep 26 13:33:10 bplmft11 snmpd[2616]: Received SNMP packet(s) from UDP: [127.0.0.1]:32771
Sep 26 13:33:25 bplmft11 snmpd[2616]: Connection from UDP: [127.0.0.1]:32771
Sep 26 13:33:55 bplmft11 last message repeated 2 times
Sep 26 13:34:06 bplmft11 fenced[31436]: bplmft12 not a cluster member after 60 sec post_join_delay
Sep 26 13:34:06 bplmft11 fenced[31436]: fencing node "bplmft12"
Sep 26 13:34:06 bplmft11 fenced[31436]: fence "bplmft12"
failed
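
In case it helps anyone compare against their own setup, here is a minimal Python sketch that reads a trimmed copy of the cluster.conf above and lists the settings that matter for joining (two_node, expected_votes, post_join_delay) along with the fence devices attached to each node. It is only an illustration and not something from my systems: the function name summarize is made up, and it assumes the usual layout where device references sit under fence/method/device. With the fencing levels removed as posted, each node should come back with an empty device list.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf posted above; the <fencedevices> and <rm>
# sections are left out because this sketch does not read them.
CLUSTER_CONF = """<?xml version="1.0" ?>
<cluster alias="plm_test" config_version="16" name="plm_test">
    <fence_daemon post_fail_delay="0" post_join_delay="60"/>
    <clusternodes>
        <clusternode name="bplmft12" nodeid="1" votes="1">
            <fence>
                <method name="1"/>
            </fence>
        </clusternode>
        <clusternode name="bplmft11" nodeid="2" votes="1">
            <fence>
                <method name="1"/>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
</cluster>
"""


def _int_attr(elem, name):
    """Return an integer attribute, or None if the element or attribute is absent."""
    if elem is None or elem.get(name) is None:
        return None
    return int(elem.get(name))


def summarize(conf_text):
    """Hypothetical helper: collect join-related settings and per-node fence devices."""
    root = ET.fromstring(conf_text)
    cman = root.find("cman")
    nodes = []
    for node in root.findall("clusternodes/clusternode"):
        # Assumption: device references live under <fence><method><device name="..."/>.
        devices = [d.get("name") for d in node.findall("fence/method/device")]
        nodes.append({
            "name": node.get("name"),
            "nodeid": _int_attr(node, "nodeid"),
            "fence_devices": devices,
        })
    return {
        "two_node": cman is not None and cman.get("two_node") == "1",
        "expected_votes": _int_attr(cman, "expected_votes"),
        "post_join_delay": _int_attr(root.find("fence_daemon"), "post_join_delay"),
        "nodes": nodes,
    }


if __name__ == "__main__":
    summary = summarize(CLUSTER_CONF)
    print("two_node:", summary["two_node"])
    print("expected_votes:", summary["expected_votes"])
    print("post_join_delay:", summary["post_join_delay"])
    for node in summary["nodes"]:
        print(node["name"], "nodeid", node["nodeid"], "fence devices:", node["fence_devices"])
```

To run it against a real file instead of the embedded string, the text of /etc/cluster/cluster.conf can be read and passed to summarize in the same way.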