I'm running CentOS 5.2 with the cluster suite and GFS1. I have an
EMC CX600 providing shared storage via some LUNs. I'm using Brocade port
fencing.
I'm experiencing a problem trying to add a previously removed node back
into the cluster. The node was having hardware RAM issues, so it was
removed from the cluster completely (i.e. removed from cluster.conf
and removed from the storage zoning as well). I then added 3 more nodes
to the cluster. Now that the bad RAM has been identified and removed, I
wanted to add the node back in. I followed the same steps that I had
used on the previous 3 nodes (i.e. used system-config-cluster to
configure the node, saved and propagated the cluster.conf, manually
copied the cluster.conf to the newly added node, and then started
cman and clvmd). However, when I tried to start cman with "service
cman start", the process hung while starting cman. I did
some digging, and in /var/log/messages on the node I'm attempting to
add, I get the following:
Jan 23 15:41:39 node004 ccsd[9342]: Initial status:: Inquorate
Jan 23 15:41:40 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:40 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:45 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:45 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:50 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:50 node004 ccsd[9342]: Error while processing connect: Connection refused
I suspect that this is at least part of the problem. However, I'm a bit
confused because the cluster it's attempting to join is most definitely
quorate, at least according to clustat -f:
Cluster Status for rsph_centos_5 @ Fri Jan 23 17:00:45 2009
Member Status: Quorate

 Member Name                    ID   Status
 ------ ----                    ---- ------
 head1.clus.sph.emory.edu          1 Online, Local
 node002.clus.sph.emory.edu        2 Online
 node003.clus.sph.emory.edu        3 Online
 node004.clus.sph.emory.edu        4 Offline
 node005.clus.sph.emory.edu        5 Online
 node006.clus.sph.emory.edu        6 Online
 node007.clus.sph.emory.edu        7 Online
I'm thinking that there is something subtle that I am missing that I
can change to make this work. I really don't want to have to re-install
and reconfigure the machine to get this to work. That is something that
you do in the Windows world :-)
So here is my cluster.conf file. Passwords changed to protect the guilty.
<?xml version="1.0"?>
<cluster alias="rsph_centos_5" config_version="41" name="rsph_centos_5">
  <fence_daemon clean_start="1" post_fail_delay="30" post_join_delay="90"/>
  <clusternodes>
    <clusternode name="head1.clus.sph.emory.edu" nodeid="1" votes="7">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="1"/>
          <device name="sanclusb1.sph.emory.edu" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node002.clus.sph.emory.edu" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="2"/>
          <device name="sanclusb1.sph.emory.edu" port="2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node003.clus.sph.emory.edu" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="3"/>
          <device name="sanclusb1.sph.emory.edu" port="3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node005.clus.sph.emory.edu" nodeid="5" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="5"/>
          <device name="sanclusb1.sph.emory.edu" port="5"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node006.clus.sph.emory.edu" nodeid="6" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="6"/>
          <device name="sanclusb1.sph.emory.edu" port="6"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node007.clus.sph.emory.edu" nodeid="7" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="7"/>
          <device name="sanclusb1.sph.emory.edu" port="7"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node004.clus.sph.emory.edu" nodeid="4" votes="1">
      <fence>
        <method name="1">
          <device name="sanclusa1.sph.emory.edu" port="4"/>
          <device name="sanclusb1.sph.emory.edu" port="4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_brocade" ipaddr="170.140.183.87"
                 login="admin" name="sanclusa1.sph.emory.edu"
                 passwd="mypasshere"/>
    <fencedevice agent="fence_brocade" ipaddr="170.140.183.88"
                 login="admin" name="sanclusb1.sph.emory.edu"
                 passwd="mypasshere"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
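
As a sanity check on the vote counts in a config like this one, here is a minimal Python sketch that reads the clusternode entries and works out the expected votes and the quorum threshold. It assumes the usual majority rule (quorum = expected votes / 2 + 1, integer division) and that a node without a votes attribute counts as 1 vote. The function name quorum_summary and the trimmed-down SAMPLE_CONF are made up for illustration; they are not part of the cluster suite.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down config used only to exercise the function.
SAMPLE_CONF = """
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="head1" nodeid="1" votes="7"/>
    <clusternode name="node002" nodeid="2" votes="1"/>
    <clusternode name="node003" nodeid="3"/>
  </clusternodes>
</cluster>
"""


def quorum_summary(conf_text):
    """Return (votes per node, expected votes, quorum threshold).

    Assumption: quorum follows the simple majority rule
    expected_votes // 2 + 1, and a missing votes attribute means 1.
    """
    root = ET.fromstring(conf_text.strip())
    votes = {}
    for node in root.iter("clusternode"):
        votes[node.get("name")] = int(node.get("votes", "1"))
    expected = sum(votes.values())
    quorum = expected // 2 + 1
    return votes, expected, quorum


if __name__ == "__main__":
    votes, expected, quorum = quorum_summary(SAMPLE_CONF)
    for name, count in sorted(votes.items()):
        print(f"{name}: {count} vote(s)")
    print(f"expected votes: {expected}, quorum: {quorum}")
```

To run it against a real file, pass the contents of /etc/cluster/cluster.conf to quorum_summary instead of SAMPLE_CONF. With the votes in the config above (7 for head1, 1 for each of the other six nodes), the same arithmetic gives 13 expected votes and a quorum of 7.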