Re: Trouble adding back in an old node


 



Vernard C. Martin wrote:
I'm running CentOS 5.2 and using the cluster suite + GFS1. I have an EMC CX600 providing shared storage via some LUNs. I'm using Brocade port fencing.

I'm experiencing a problem trying to add a previously removed node back into the cluster. The node was having hardware RAM issues, so it was removed from the cluster completely (i.e. removed from cluster.conf and from the storage zoning as well). I then added 3 more nodes to the cluster. Now that the bad RAM has been identified and removed, I want to add the node back in. I followed the same procedure I had used for the previous 3 nodes (i.e. used system-config-cluster to configure the node, saved and propagated the cluster.conf, manually copied the cluster.conf to the newly added node, and then started cman and clvmd). However, when I try to start cman with "service cman start", the process hangs while starting cman. I did some digging, and in /var/log/messages on the node I'm attempting to add I get the following:

Jan 23 15:41:39 node004 ccsd[9342]: Initial status:: Inquorate
Jan 23 15:41:40 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:40 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:45 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:45 node004 ccsd[9342]: Error while processing connect: Connection refused
Jan 23 15:41:50 node004 ccsd[9342]: Cluster is not quorate. Refusing connection.
Jan 23 15:41:50 node004 ccsd[9342]: Error while processing connect: Connection refused

I suspect that this is at least part of the problem. However, I'm a bit confused because the cluster it's attempting to join is most definitely quorate, at least according to clustat -f:

Cluster Status for rsph_centos_5 @ Fri Jan 23 17:00:45 2009
Member Status: Quorate

Member Name                      ID   Status
------ ----                      ---- ------
head1.clus.sph.emory.edu            1 Online, Local
node002.clus.sph.emory.edu          2 Online
node003.clus.sph.emory.edu          3 Online
node004.clus.sph.emory.edu          4 Offline
node005.clus.sph.emory.edu          5 Online
node006.clus.sph.emory.edu          6 Online
node007.clus.sph.emory.edu          7 Online


I'm thinking that there is something subtle that I am missing that I can change to make this work. I really don't want to have to re-install and reconfigure the machine to get this to work. That is something that you do in the Windows world :-)


So here is my cluster.conf file. Passwords changed to protect the guilty.

<?xml version="1.0"?>
<cluster alias="rsph_centos_5" config_version="41" name="rsph_centos_5">
        <fence_daemon clean_start="1" post_fail_delay="30" post_join_delay="90"/>
        <clusternodes>
                <clusternode name="head1.clus.sph.emory.edu" nodeid="1" votes="7">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="1"/>
                                        <device name="sanclusb1.sph.emory.edu" port="1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node002.clus.sph.emory.edu" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="2"/>
                                        <device name="sanclusb1.sph.emory.edu" port="2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node003.clus.sph.emory.edu" nodeid="3" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="3"/>
                                        <device name="sanclusb1.sph.emory.edu" port="3"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node005.clus.sph.emory.edu" nodeid="5" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="5"/>
                                        <device name="sanclusb1.sph.emory.edu" port="5"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node006.clus.sph.emory.edu" nodeid="6" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="6"/>
                                        <device name="sanclusb1.sph.emory.edu" port="6"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node007.clus.sph.emory.edu" nodeid="7" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="7"/>
                                        <device name="sanclusb1.sph.emory.edu" port="7"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node004.clus.sph.emory.edu" nodeid="4" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="sanclusa1.sph.emory.edu" port="4"/>
                                        <device name="sanclusb1.sph.emory.edu" port="4"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
                <fencedevice agent="fence_brocade" ipaddr="170.140.183.87" login="admin" name="sanclusa1.sph.emory.edu" passwd="mypasshere"/>
                <fencedevice agent="fence_brocade" ipaddr="170.140.183.88" login="admin" name="sanclusb1.sph.emory.edu" passwd="mypasshere"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster


Your cluster.conf contains only an empty <cman/> element, with no attributes set on it (nothing like <cman parameter1="1" parameter2="2"/>). Is this correct?

The cman stanza is where you would define expected_votes on the cluster, so not having this present is perhaps the reason why ccsd believes the cluster is inquorate?
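
To put numbers on the expected_votes question, here is a rough Python sketch that sums the votes attributes from a trimmed copy of the posted cluster.conf and applies the usual majority rule, quorum = expected_votes // 2 + 1. The function names are made up for illustration, and the assumption that cman falls back to the sum of the node votes when expected_votes is not set should be checked against the cman documentation for your release.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cluster.conf: only node names and votes are kept.
CLUSTER_CONF = """<cluster alias="rsph_centos_5" config_version="41" name="rsph_centos_5">
  <clusternodes>
    <clusternode name="head1.clus.sph.emory.edu" nodeid="1" votes="7"/>
    <clusternode name="node002.clus.sph.emory.edu" nodeid="2" votes="1"/>
    <clusternode name="node003.clus.sph.emory.edu" nodeid="3" votes="1"/>
    <clusternode name="node005.clus.sph.emory.edu" nodeid="5" votes="1"/>
    <clusternode name="node006.clus.sph.emory.edu" nodeid="6" votes="1"/>
    <clusternode name="node007.clus.sph.emory.edu" nodeid="7" votes="1"/>
    <clusternode name="node004.clus.sph.emory.edu" nodeid="4" votes="1"/>
  </clusternodes>
  <cman/>
</cluster>
"""


def total_votes(conf_xml):
    """Sum the votes attribute over all clusternode elements (hypothetical helper)."""
    root = ET.fromstring(conf_xml)
    # Assumption: a node without an explicit votes attribute counts as 1 vote.
    return sum(int(node.get("votes", "1")) for node in root.iter("clusternode"))


def explicit_expected_votes(conf_xml):
    """Return expected_votes from the cman element, or None if it is not set."""
    cman = ET.fromstring(conf_xml).find("cman")
    if cman is None or cman.get("expected_votes") is None:
        return None
    return int(cman.get("expected_votes"))


def quorum_threshold(expected_votes):
    """Majority rule: more than half of the expected votes."""
    return expected_votes // 2 + 1


if __name__ == "__main__":
    votes = total_votes(CLUSTER_CONF)
    expected = explicit_expected_votes(CLUSTER_CONF)
    if expected is None:
        # Assumption: fall back to the sum of the configured node votes.
        expected = votes
    print("total votes:", votes)
    print("expected votes:", expected)
    print("quorum threshold:", quorum_threshold(expected))
```

With the posted votes (7 for head1 and 1 for each of the other six nodes) this gives 13 total votes and a quorum threshold of 7.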

Regards,

Stewart

