Still having the problem. I can't figure it out. I just upgraded to the latest 5.1 cman. No help!

-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
Sent: Tuesday, March 25, 2008 10:57 AM
To: linux clustering
Subject: Re: 3 node cluster problems

Glad they are working. I have not used LVM with our clusters. You have now
piqued my curiosity and I will have to try building one. So were you also
using GFS?

Dalton, Maurice wrote:
> Sorry, but security here will not allow me to send the hosts files.
>
> BUT...
>
> I was getting this in /var/log/messages on csarcsys3:
>
> Mar 25 15:26:11 csarcsys3-eth0 ccsd[7448]: Cluster is not quorate.  Refusing connection.
> Mar 25 15:26:11 csarcsys3-eth0 ccsd[7448]: Error while processing connect: Connection refused
> Mar 25 15:26:12 csarcsys3-eth0 dlm_controld[7476]: connect to ccs error -111, check ccsd or cluster status
> Mar 25 15:26:12 csarcsys3-eth0 ccsd[7448]: Cluster is not quorate.  Refusing connection.
> Mar 25 15:26:12 csarcsys3-eth0 ccsd[7448]: Error while processing connect: Connection refused
>
> I had /dev/vg0/gfsvol on these systems. I did an lvremove, restarted cman
> on all systems, and for some strange reason my clusters are working.
>
> It doesn't make any sense.
>
> I can't thank you enough for your help!
>
> Thanks.
>
> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
> Sent: Tuesday, March 25, 2008 10:27 AM
> To: linux clustering
> Subject: Re: 3 node cluster problems
>
> I am currently running several 3-node clusters without a quorum disk.
> However, if you want your cluster to run when only one node is up, then
> you will need a quorum disk. Can you send your /etc/hosts file for all
> systems? Also, could there be another node named csarcsys3-eth0 in your
> NIS or DNS?
>
> I configured some using Conga and some with system-config-cluster. When
> using system-config-cluster I basically run the config on all nodes,
> just adding the node names and cluster name. I reboot all nodes to make
> sure they see each other, then go back and modify the config files.
>
> The file /var/log/messages should also shed some light on the problem.
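For context on why the quorum disk matters here: with three 1-vote nodes and no
qdisk, a lone node can never be quorate, while a qdisk carrying two extra votes
changes that. Roughly (the expected_votes figures below are shown for
illustration only and are not taken from the posted configs):

    3 nodes x 1 vote, no qdisk:       expected_votes = 3, quorum = 3/2 + 1 = 2
        -> two surviving nodes stay quorate, a single node does not
    3 nodes + qdisk with votes="2":   expected_votes = 5, quorum = 5/2 + 1 = 3
        -> one node (1 vote) plus the qdisk (2 votes) = 3 votes, still quorate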
> Dalton, Maurice wrote:
>> Same problem. I now have qdiskd running.
>>
>> I have run diffs on all three cluster.conf files; all are the same.
>>
>> [root@csarcsys1-eth0 cluster]# more cluster.conf
>>
>> <?xml version="1.0"?>
>> <cluster config_version="6" name="csarcsys5">
>>   <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>   <clusternodes>
>>     <clusternode name="csarcsys1-eth0" nodeid="1" votes="1">
>>       <fence/>
>>     </clusternode>
>>     <clusternode name="csarcsys2-eth0" nodeid="2" votes="1">
>>       <fence/>
>>     </clusternode>
>>     <clusternode name="csarcsys3-eth0" nodeid="3" votes="1">
>>       <fence/>
>>     </clusternode>
>>   </clusternodes>
>>   <cman/>
>>   <fencedevices/>
>>   <rm>
>>     <failoverdomains>
>>       <failoverdomain name="csarcsysfo" ordered="0" restricted="1">
>>         <failoverdomainnode name="csarcsys1-eth0" priority="1"/>
>>         <failoverdomainnode name="csarcsys2-eth0" priority="1"/>
>>         <failoverdomainnode name="csarcsys3-eth0" priority="1"/>
>>       </failoverdomain>
>>     </failoverdomains>
>>     <resources>
>>       <ip address="172.24.86.177" monitor_link="1"/>
>>       <fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
>>     </resources>
>>   </rm>
>>   <quorumd interval="4" label="csarcsysQ" min_score="1" tko="30" votes="2"/>
>> </cluster>
>>
>> More info from csarcsys3:
>>
>> [root@csarcsys3-eth0 cluster]# clustat
>> msg_open: No such file or directory
>> Member Status: Inquorate
>>
>> Member Name                ID   Status
>> ------ ----                ---- ------
>> csarcsys1-eth0                1 Offline
>> csarcsys2-eth0                2 Offline
>> csarcsys3-eth0                3 Online, Local
>> /dev/sdd1                     0 Offline
>>
>> [root@csarcsys3-eth0 cluster]# mkqdisk -L
>> mkqdisk v0.5.1
>> /dev/sdd1:
>>     Magic:   eb7a62c2
>>     Label:   csarcsysQ
>>     Created: Wed Feb 13 13:44:35 2008
>>     Host:    csarcsys1-eth0.xxx.xxx.nasa.gov
>>
>> [root@csarcsys3-eth0 cluster]# ls -l /dev/sdd1
>> brw-r----- 1 root disk 8, 49 Mar 25 14:09 /dev/sdd1
>>
>> clustat from csarcsys1:
>>
>> msg_open: No such file or directory
>> Member Status: Quorate
>>
>> Member Name                ID   Status
>> ------ ----                ---- ------
>> csarcsys1-eth0                1 Online, Local
>> csarcsys2-eth0                2 Online
>> csarcsys3-eth0                3 Offline
>> /dev/sdd1                     0 Offline, Quorum Disk
>>
>> [root@csarcsys1-eth0 cluster]# ls -l /dev/sdd1
>> brw-r----- 1 root disk 8, 49 Mar 25 14:19 /dev/sdd1
>>
>> mkqdisk v0.5.1
>> /dev/sdd1:
>>     Magic:   eb7a62c2
>>     Label:   csarcsysQ
>>     Created: Wed Feb 13 13:44:35 2008
>>     Host:    csarcsys1-eth0.xxx.xxx.nasa.gov
>>
>> Info from csarcsys2:
>>
>> [root@csarcsys2-eth0 cluster]# clustat
>> msg_open: No such file or directory
>> Member Status: Quorate
>>
>> Member Name                ID   Status
>> ------ ----                ---- ------
>> csarcsys1-eth0                1 Offline
>> csarcsys2-eth0                2 Online, Local
>> csarcsys3-eth0                3 Offline
>> /dev/sdd1                     0 Online, Quorum Disk
>>
>> *From:* linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] *On Behalf Of* Panigrahi, Santosh Kumar
>> *Sent:* Tuesday, March 25, 2008 7:33 AM
>> *To:* linux clustering
>> *Subject:* RE: 3 node cluster problems
>>
>> If you are configuring your cluster with system-config-cluster then there
>> is no need to run ricci/luci; ricci and luci are only needed when
>> configuring the cluster with Conga. You can configure it either way.
>>
>> Looking at your clustat outputs, it seems the cluster is partitioned
>> (split brain) into two sub-clusters: (1) csarcsys1-eth0 and
>> csarcsys2-eth0, and (2) csarcsys3-eth0. Without a quorum device you can
>> face this situation more often. To avoid it you can configure a quorum
>> device with a heuristic such as a ping check. See
>> http://www.redhatmagazine.com/2007/12/19/enhancing-cluster-quorum-with-qdisk/
>> for configuring a quorum disk in RHCS.
>>
>> Thanks,
>> S
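To make that suggestion concrete: the quorumd line already in the posted
config would simply gain a heuristic child element. A minimal sketch, reusing
the label reported by mkqdisk; the gateway address 172.24.86.1 is a
placeholder, not something taken from the thread:

    <quorumd interval="4" tko="30" votes="2" min_score="1" label="csarcsysQ">
      <heuristic program="ping -c1 -w1 172.24.86.1" score="1" interval="2"/>
    </quorumd>

Roughly, the heuristic program is run every interval seconds; a node that
cannot reach the gateway scores 0, falls below min_score, and gives up its
claim on the quorum disk, so a cut-off minority cannot keep the qdisk votes.
qdiskd has to be running on every node for the disk votes to count at all.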
>> -----Original Message-----
>> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Dalton, Maurice
>> Sent: Tuesday, March 25, 2008 5:18 PM
>> To: linux clustering
>> Subject: RE: 3 node cluster problems
>>
>> Still no change. Same as below.
>>
>> I completely rebuilt the cluster using system-config-cluster. The cluster
>> software was installed from RHN, and luci and ricci are running.
>>
>> This is the new config file, and it has been copied to the two other systems:
>>
>> [root@csarcsys1-eth0 cluster]# more cluster.conf
>>
>> <?xml version="1.0"?>
>> <cluster config_version="5" name="csarcsys5">
>>   <fence_daemon post_fail_delay="0" post_join_delay="3"/>
>>   <clusternodes>
>>     <clusternode name="csarcsys1-eth0" nodeid="1" votes="1">
>>       <fence/>
>>     </clusternode>
>>     <clusternode name="csarcsys2-eth0" nodeid="2" votes="1">
>>       <fence/>
>>     </clusternode>
>>     <clusternode name="csarcsys3-eth0" nodeid="3" votes="1">
>>       <fence/>
>>     </clusternode>
>>   </clusternodes>
>>   <cman/>
>>   <fencedevices/>
>>   <rm>
>>     <failoverdomains>
>>       <failoverdomain name="csarcsysfo" ordered="0" restricted="1">
>>         <failoverdomainnode name="csarcsys1-eth0" priority="1"/>
>>         <failoverdomainnode name="csarcsys2-eth0" priority="1"/>
>>         <failoverdomainnode name="csarcsys3-eth0" priority="1"/>
>>       </failoverdomain>
>>     </failoverdomains>
>>     <resources>
>>       <ip address="172.xx.xx.xxx" monitor_link="1"/>
>>       <fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
>>     </resources>
>>   </rm>
>> </cluster>
>>
>> -----Original Message-----
>> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
>> Sent: Monday, March 24, 2008 4:17 PM
>> To: linux clustering
>> Subject: Re: 3 node cluster problems
>>
>> Did you load the cluster software via Conga or manually? You would have
>> had to load luci on one node and ricci on all three.
>>
>> Try copying the modified /etc/cluster/cluster.conf from csarcsys1 to the
>> other two nodes. Make sure you can ping the private interface to/from all
>> nodes and reboot. If this does not work, post your
>> /etc/cluster/cluster.conf file again.
>>
>> Dalton, Maurice wrote:
>>> Yes.
>>>
>>> I also rebooted again just now to be sure.
>>>
>>> -----Original Message-----
>>> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
>>> Sent: Monday, March 24, 2008 3:33 PM
>>> To: linux clustering
>>> Subject: Re: 3 node cluster problems
>>>
>>> When you changed the node names in /etc/cluster/cluster.conf and made
>>> sure the /etc/hosts file had the correct node names (i.e.
>>> 10.0.0.100 csarcsys1-eth0 csarcsys1-eth0.xxxx.xxxx.xxx), did you reboot
>>> all the nodes at the same time?
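For illustration, the /etc/hosts entries being asked about would look roughly
like the following, identical on all three nodes. The 10.0.0.x addresses just
follow the example above and are placeholders, not the real cluster addresses:

    # each clusternode name from cluster.conf must resolve to that node's
    # cluster (private) interface
    10.0.0.100  csarcsys1-eth0  csarcsys1-eth0.xxx.xxxx.nasa.gov
    10.0.0.101  csarcsys2-eth0  csarcsys2-eth0.xxx.xxxx.nasa.gov
    10.0.0.102  csarcsys3-eth0  csarcsys3-eth0.xxx.xxxx.nasa.gov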
>>> Dalton, Maurice wrote:
>>>> No luck. It seems as if csarcsys3 thinks it is in its own cluster.
>>>>
>>>> I renamed all the config files and rebuilt from system-config-cluster.
>>>>
>>>> clustat command from csarcsys3:
>>>>
>>>> [root@csarcsys3-eth0 cluster]# clustat
>>>> msg_open: No such file or directory
>>>> Member Status: Inquorate
>>>>
>>>> Member Name                ID   Status
>>>> ------ ----                ---- ------
>>>> csarcsys1-eth0                1 Offline
>>>> csarcsys2-eth0                2 Offline
>>>> csarcsys3-eth0                3 Online, Local
>>>>
>>>> clustat command from csarcsys2:
>>>>
>>>> [root@csarcsys2-eth0 cluster]# clustat
>>>> msg_open: No such file or directory
>>>> Member Status: Quorate
>>>>
>>>> Member Name                ID   Status
>>>> ------ ----                ---- ------
>>>> csarcsys1-eth0                1 Online
>>>> csarcsys2-eth0                2 Online, Local
>>>> csarcsys3-eth0                3 Offline
>>>>
>>>> -----Original Message-----
>>>> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
>>>> Sent: Monday, March 24, 2008 2:25 PM
>>>> To: linux clustering
>>>> Subject: Re: 3 node cluster problems
>>>>
>>>> You will also need to make sure the clustered node names are in your
>>>> /etc/hosts file. Also, make sure your cluster network interface is up
>>>> on all nodes and that /etc/cluster/cluster.conf is the same on all
>>>> nodes.
>>>>
>>>> Dalton, Maurice wrote:
>>>>> The last post is incorrect.
>>>>>
>>>>> Fence is still hanging at start-up. Here's another log message:
>>>>>
>>>>> Mar 24 19:03:14 csarcsys3-eth0 ccsd[6425]: Error while processing connect: Connection refused
>>>>> Mar 24 19:03:15 csarcsys3-eth0 dlm_controld[6453]: connect to ccs error -111, check ccsd or cluster status
>>>>>
>>>>> *From:* linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx] *On Behalf Of* Bennie Thomas
>>>>> *Sent:* Monday, March 24, 2008 11:22 AM
>>>>> *To:* linux clustering
>>>>> *Subject:* Re: 3 node cluster problems
>>>>>
>>>>> Try removing the fully qualified hostname from the cluster.conf file.
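In other words, the change being suggested is to the clusternode (and matching
failoverdomainnode) entries, roughly like this, with the short name then
resolving via /etc/hosts to the node's cluster interface:

    Before:  <clusternode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" nodeid="1" votes="1">
    After:   <clusternode name="csarcsys1-eth0" nodeid="1" votes="1">

This matches the short names used in the rebuilt cluster.conf files shown
further up in this thread.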
>>>>> Dalton, Maurice wrote:
>>>>> I have NO fencing equipment.
>>>>>
>>>>> I have been tasked with setting up a 3-node cluster. Currently I am
>>>>> having problems getting cman (fence) to start: fence will try to start
>>>>> up during cman start-up but will fail.
>>>>>
>>>>> I tried to run /sbin/fenced -D and I get the following:
>>>>>
>>>>> 1206373475 cman_init error 0 111
>>>>>
>>>>> Here's my cluster.conf file:
>>>>>
>>>>> <?xml version="1.0"?>
>>>>> <cluster alias="csarcsys51" config_version="26" name="csarcsys51">
>>>>>   <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>>>>>   <clusternodes>
>>>>>     <clusternode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" nodeid="1" votes="1">
>>>>>       <fence/>
>>>>>     </clusternode>
>>>>>     <clusternode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" nodeid="2" votes="1">
>>>>>       <fence/>
>>>>>     </clusternode>
>>>>>     <clusternode name="csarcsys3-eth0.xxx.xxxxnasa.gov" nodeid="3" votes="1">
>>>>>       <fence/>
>>>>>     </clusternode>
>>>>>   </clusternodes>
>>>>>   <cman/>
>>>>>   <fencedevices/>
>>>>>   <rm>
>>>>>     <failoverdomains>
>>>>>       <failoverdomain name="csarcsys-fo" ordered="1" restricted="0">
>>>>>         <failoverdomainnode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" priority="1"/>
>>>>>         <failoverdomainnode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" priority="1"/>
>>>>>         <failoverdomainnode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" priority="1"/>
>>>>>       </failoverdomain>
>>>>>     </failoverdomains>
>>>>>     <resources>
>>>>>       <ip address="xxx.xxx.xxx.xxx" monitor_link="1"/>
>>>>>       <fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
>>>>>       <nfsexport name="csarcsys-export"/>
>>>>>       <nfsclient name="csarcsys-nfs-client" options="no_root_squash,rw" path="/csarc-test" target="xxx.xxx.xxx.*"/>
>>>>>     </resources>
>>>>>   </rm>
>>>>> </cluster>
>>>>>
>>>>> Messages from the logs:
>>>>>
>>>>> Mar 24 13:24:19 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.  Refusing connection.
>>>>> Mar 24 13:24:19 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
>>>>> Mar 24 13:24:20 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.  Refusing connection.
>>>>> Mar 24 13:24:20 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
>>>>> Mar 24 13:24:21 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.  Refusing connection.
>>>>> Mar 24 13:24:21 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
>>>>> Mar 24 13:24:22 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.  Refusing connection.
>>>>> Mar 24 13:24:22 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
>>>>> Mar 24 13:24:23 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate.  Refusing connection.
>>>>> Mar 24 13:24:23 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
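As an aside for anyone hitting the same symptom: errno 111 is ECONNREFUSED, so
the cman_init/dlm_controld errors above are consistent with ccsd refusing
connections while the cluster is inquorate. A quick way to compare what each
node believes, using the standard cman tools already in use in this thread, is
roughly:

    cman_tool status   # expected votes, total votes, quorum, and whether this node is quorate
    cman_tool nodes    # which members this node can actually see
    clustat            # member and (if configured) quorum-disk status

If the nodes disagree about membership, as the clustat outputs earlier in the
thread show, the usual suspects are name resolution (cluster.conf node names
versus /etc/hosts) and the interface the cluster traffic is using.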
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster