Vernard C. Martin wrote:
I'm running CentOS 5.2 and using the cluster suite + GFS1. I have
an EMC CX600 providing shared storage to some LUNs. I'm using Brocade
port fencing.
I'm experiencing a problem trying to add a previously removed node
back into the cluster. The node was having hardware RAM issues so it
was removed from the cluster completely (i.e. removed from the
cluster.conf and removed from the storage zoning as well). I then
added 3 more nodes to the cluster. Now that the bad RAM has been
identified and removed, I wanted to add the node back in. I followed
the instructions that I had used on the previous 3 nodes (i.e. used
system-config-cluster to configure the node, save and propagate the
cluster.conf, manually propagate the cluster.conf to the newly added
node, and then start up cman and clvmd). However, when I tried to start
up cman with "service cman start", the process hung while actually
starting up cman. I did some digging and in the /var/log/messages of
the node I'm attempting to add, I get the following:
Jan 23 15:41:39 node004 ccsd[9342]: Initial status:: Inquorate
Jan 23 15:41:40 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:40 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:45 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:45 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:50 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:50 node004 ccsd[9342]: Error while processing connect: Connection refused
I suspect that this is at least part of the problem. However, I'm a
bit confused because the cluster it's attempting to join is most
definitely quorate, at least according to clustat -f:
Cluster Status for rsph_centos_5 @ Fri Jan 23 17:00:45 2009
Member Status: Quorate

 Member Name                   ID   Status
 ------ ----                   ---- ------
 head1.clus.sph.emory.edu         1 Online, Local
 node002.clus.sph.emory.edu       2 Online
 node003.clus.sph.emory.edu       3 Online
 node004.clus.sph.emory.edu       4 Offline
 node005.clus.sph.emory.edu       5 Online
 node006.clus.sph.emory.edu       6 Online
 node007.clus.sph.emory.edu       7 Online
I'm thinking that there is something subtle that I am missing that I
can change to make this work. I really don't want to have to
re-install and reconfigure the machine to get this to work. That is
something that you do in the Windows world :-)
So here is my cluster.conf file. Passwords changed to protect the guilty.
<?xml version="1.0"?>
<cluster alias="rsph_centos_5" config_version="41" name="rsph_centos_5">
  <fence_daemon clean_start="1" post_fail_delay="30" post_join_delay="90"/>
  <clusternodes>
    <clusternode name="head1.clus.sph.emory.edu" nodeid="1" votes="7">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="1"/>
          <device name="sanclusb1.sph.emory.edu" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node002.clus.sph.emory.edu" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="2"/>
          <device name="sanclusb1.sph.emory.edu" port="2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node003.clus.sph.emory.edu" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="3"/>
          <device name="sanclusb1.sph.emory.edu" port="3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node005.clus.sph.emory.edu" nodeid="5" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="5"/>
          <device name="sanclusb1.sph.emory.edu" port="5"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node006.clus.sph.emory.edu" nodeid="6" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="6"/>
          <device name="sanclusb1.sph.emory.edu" port="6"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node007.clus.sph.emory.edu" nodeid="7" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="7"/>
          <device name="sanclusb1.sph.emory.edu" port="7"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node004.clus.sph.emory.edu" nodeid="4" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="4"/>
          <device name="sanclusb1.sph.emory.edu" port="4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_brocade" ipaddr="170.140.183.87" login="admin" name="sanclusa1.sph.emory.edu" passwd="mypasshere"/>
    <fencedevice agent="fence_brocade" ipaddr="170.140.183.88" login="admin" name="sanclusb1.sph.emory.edu" passwd="mypasshere"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
Your cluster.conf contains only an empty <cman/> element, with no
attributes set on it (nothing like <cman parameter1="1" parameter2="2"/>).
Is this correct? The cman stanza is where you would define expected_votes
for the cluster, so its absence is perhaps the reason why ccsd believes
the cluster is inquorate?
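
For illustration, a cman stanza with expected_votes set explicitly might
look like the sketch below. The value 13 is an assumption, taken from
summing the votes attributes in the posted cluster.conf (7 for head1 plus
1 for each of the six other nodes), which would put quorum at 7 votes. As
far as I understand from the cman(5) man page, cman derives the same total
on its own when the attribute is absent, so treat this as something to
test rather than a confirmed fix.

```xml
<!-- Illustrative sketch only, not a verified fix.
     expected_votes is set to the sum of the votes attributes in the
     posted cluster.conf: 7 (head1) + 6 x 1 (other nodes) = 13. -->
<cman expected_votes="13"/>
```

If you try this, remember to bump config_version and propagate the file
to every node, including node004, before starting cman again.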
Regards,
Stewart