How are your cluster connections wired (i.e., are you using a hub, a
switch, or directly connected heartbeat cables)?
Dalton, Maurice wrote:
Still having the problem. I can't figure it out.
I just upgraded to the latest 5.1 cman. No help!
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
Sent: Tuesday, March 25, 2008 10:57 AM
To: linux clustering
Subject: Re: 3 node cluster problems
Glad they are working. I have not used LVM with our clusters. You have
now piqued my curiosity, and I will have to try building one. So were
you also using GFS?
Dalton, Maurice wrote:
Sorry, but security here will not allow me to send the hosts files.
However:
I was getting this in /var/log/messages on csarcsys3
Mar 25 15:26:11 csarcsys3-eth0 ccsd[7448]: Cluster is not quorate. Refusing connection.
Mar 25 15:26:11 csarcsys3-eth0 ccsd[7448]: Error while processing connect: Connection refused
Mar 25 15:26:12 csarcsys3-eth0 dlm_controld[7476]: connect to ccs error -111, check ccsd or cluster status
Mar 25 15:26:12 csarcsys3-eth0 ccsd[7448]: Cluster is not quorate. Refusing connection.
Mar 25 15:26:12 csarcsys3-eth0 ccsd[7448]: Error while processing connect: Connection refused
I had /dev/vg0/gfsvol on these systems.
I did an lvremove, restarted cman on all systems, and for some strange
reason my clusters are working.
It doesn't make any sense.
I can't thank you enough for your help!
Thanks.
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
Sent: Tuesday, March 25, 2008 10:27 AM
To: linux clustering
Subject: Re: 3 node cluster problems
I am currently running several 3-node clusters without a quorum disk.
However, if you want your cluster to keep running when only one node is
up, then you will need a quorum disk. Can you send your /etc/hosts file
for all systems? Also, could there be another node named csarcsys3-eth0
in your NIS or DNS?
I configured some using Conga and some with system-config-cluster. When
using system-config-cluster, I basically run the config on all nodes,
adding just the node names and the cluster name. I reboot all nodes to
make sure they see each other, then go back and modify the config files.
The file /var/log/messages should also shed some light on the problem.
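If the hosts files cannot be posted, the same question can be checked locally on each node. The following is a minimal Python sketch, not something from this thread: the function names and the sample addresses are made up for illustration. It reports node names that are missing from hosts-file text or that map to more than one address. It does not look at NIS or DNS.

```python
def parse_hosts(text):
    """Map each hostname/alias in hosts-file text to the set of addresses listing it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, set()).add(address)
    return table


def check_hosts(text, node_names):
    """Return (missing, ambiguous) node names for the given hosts-file text."""
    table = parse_hosts(text)
    missing = [n for n in node_names if n not in table]
    ambiguous = [n for n in node_names if len(table.get(n, ())) > 1]
    return missing, ambiguous


# Hypothetical hosts content; the addresses are placeholders, not real values.
sample_hosts = """\
10.0.0.100 csarcsys1-eth0
10.0.0.101 csarcsys2-eth0
10.0.0.102 csarcsys2-eth0   # same name on a second address
"""
nodes = ["csarcsys1-eth0", "csarcsys2-eth0", "csarcsys3-eth0"]
missing, ambiguous = check_hosts(sample_hosts, nodes)
print("missing:", missing)
print("ambiguous:", ambiguous)
```

On a real node, the text would come from reading /etc/hosts, and the node list from the clusternode names in cluster.conf.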
Dalton, Maurice wrote:
Same problem.
I now have qdiskd running.
I have run diffs on all three cluster.conf files; all are the same.
[root@csarcsys1-eth0 cluster]# more cluster.conf
<?xml version="1.0"?>
<cluster config_version="6" name="csarcsys5">
<fence_daemon post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="csarcsys1-eth0" nodeid="1" votes="1">
<fence/>
</clusternode>
<clusternode name="csarcsys2-eth0" nodeid="2" votes="1">
<fence/>
</clusternode>
<clusternode name="csarcsys3-eth0" nodeid="3" votes="1">
<fence/>
</clusternode>
</clusternodes>
<cman/>
<fencedevices/>
<rm>
<failoverdomains>
<failoverdomain name="csarcsysfo" ordered="0" restricted="1">
<failoverdomainnode name="csarcsys1-eth0" priority="1"/>
<failoverdomainnode name="csarcsys2-eth0" priority="1"/>
<failoverdomainnode name="csarcsys3-eth0" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="172.24.86.177" monitor_link="1"/>
<fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
</resources>
</rm>
<quorumd interval="4" label="csarcsysQ" min_score="1" tko="30" votes="2"/>
</cluster>
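To make the vote counting in this configuration concrete, here is a minimal Python sketch (not from the thread). It assumes that quorum is a simple majority of expected votes, floor(expected / 2) + 1, and that expected votes are the node votes plus the quorum disk votes. The function names are illustrative only.

```python
def quorum_threshold(expected_votes):
    """Smallest number of votes that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present, expected_votes):
    """True when the votes currently visible reach the majority threshold."""
    return votes_present >= quorum_threshold(expected_votes)


# Vote values taken from the cluster.conf above.
node_votes = {"csarcsys1-eth0": 1, "csarcsys2-eth0": 1, "csarcsys3-eth0": 1}
qdisk_votes = 2

without_qdisk = sum(node_votes.values())     # 3 expected votes
with_qdisk = without_qdisk + qdisk_votes     # 5 expected votes

print(quorum_threshold(without_qdisk))           # 2
print(is_quorate(1, without_qdisk))              # False: one node alone
print(quorum_threshold(with_qdisk))              # 3
print(is_quorate(1 + qdisk_votes, with_qdisk))   # True: one node plus the quorum disk
print(is_quorate(1, with_qdisk))                 # False: one node, quorum disk offline
```

Under these assumptions the numbers appear consistent with the clustat output that follows: csarcsys2-eth0 sees only itself plus an online quorum disk and reports Quorate, while csarcsys3-eth0 sees the quorum disk as offline and reports Inquorate.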
More info from csarcsys3
[root@csarcsys3-eth0 cluster]# clustat
msg_open: No such file or directory
Member Status: Inquorate
Member Name ID Status
------ ---- ---- ------
csarcsys1-eth0 1 Offline
csarcsys2-eth0 2 Offline
csarcsys3-eth0 3 Online, Local
/dev/sdd1 0 Offline
[root@csarcsys3-eth0 cluster]# mkqdisk -L
mkqdisk v0.5.1
/dev/sdd1:
Magic: eb7a62c2
Label: csarcsysQ
Created: Wed Feb 13 13:44:35 2008
Host: csarcsys1-eth0.xxx.xxx.nasa.gov
[root@csarcsys3-eth0 cluster]# ls -l /dev/sdd1
brw-r----- 1 root disk 8, 49 Mar 25 14:09 /dev/sdd1
clustat from csarcsys1
msg_open: No such file or directory
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
csarcsys1-eth0 1 Online, Local
csarcsys2-eth0 2 Online
csarcsys3-eth0 3 Offline
/dev/sdd1 0 Offline, Quorum Disk
[root@csarcsys1-eth0 cluster]# ls -l /dev/sdd1
brw-r----- 1 root disk 8, 49 Mar 25 14:19 /dev/sdd1
mkqdisk v0.5.1
/dev/sdd1:
Magic: eb7a62c2
Label: csarcsysQ
Created: Wed Feb 13 13:44:35 2008
Host: csarcsys1-eth0.xxx.xxx.nasa.gov
Info from csarcsys2
[root@csarcsys2-eth0 cluster]# clustat
msg_open: No such file or directory
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
csarcsys1-eth0 1 Offline
csarcsys2-eth0 2 Online, Local
csarcsys3-eth0 3 Offline
/dev/sdd1 0 Online, Quorum Disk
*From:* linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] *On Behalf Of *Panigrahi,
Santosh Kumar
*Sent:* Tuesday, March 25, 2008 7:33 AM
*To:* linux clustering
*Subject:* RE: 3 node cluster problems
If you are configuring your cluster with system-config-cluster, there is
no need to run ricci/luci. Ricci/luci are needed only for configuring the
cluster using Conga. You can configure it either way.
Looking at your clustat command outputs, it seems the cluster is
partitioned (split brain) into 2 sub-clusters: Sub 1 (csarcsys1-eth0,
csarcsys2-eth0) and Sub 2 (csarcsys3-eth0). Without a quorum device you
are more likely to face this situation. To avoid it, you can configure a
quorum device with a heuristic such as a ping check.
Use this link for configuring a quorum disk in RHCS:
http://www.redhatmagazine.com/2007/12/19/enhancing-cluster-quorum-with-qdisk/
Thanks,
S
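The partition diagnosis can be restated as a comparison of each node's view of who is Online. The following is a small Python sketch of that comparison, not a tool from RHCS. The views are typed in by hand from the clustat outputs quoted further down in this thread, and the function names are hypothetical.

```python
def distinct_views(views):
    """Return the distinct sets of Online members reported across nodes."""
    unique = {frozenset(online) for online in views.values()}
    return sorted(sorted(view) for view in unique)


def looks_partitioned(views):
    """True when the reporting nodes disagree about which members are Online."""
    return len(distinct_views(views)) > 1


# Keyed by the node that ran clustat; values are the members it listed as Online.
views = {
    "csarcsys2-eth0": {"csarcsys1-eth0", "csarcsys2-eth0"},
    "csarcsys3-eth0": {"csarcsys3-eth0"},
}

for view in distinct_views(views):
    print(view)
print("partitioned:", looks_partitioned(views))
```

Disagreeing views alone do not say why the nodes cannot see each other; they only confirm that more than one sub-cluster has formed.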
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Dalton,
Maurice
Sent: Tuesday, March 25, 2008 5:18 PM
To: linux clustering
Subject: RE: 3 node cluster problems
Still no change. Same as below.
I completely rebuilt the cluster using system-config-cluster.
The cluster software was installed from RHN; luci and ricci are
running.
This is the new config file, and it has been copied to the 2 other
systems:
[root@csarcsys1-eth0 cluster]# more cluster.conf
<?xml version="1.0"?>
<cluster config_version="5" name="csarcsys5">
<fence_daemon post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="csarcsys1-eth0" nodeid="1" votes="1">
<fence/>
</clusternode>
<clusternode name="csarcsys2-eth0" nodeid="2" votes="1">
<fence/>
</clusternode>
<clusternode name="csarcsys3-eth0" nodeid="3" votes="1">
<fence/>
</clusternode>
</clusternodes>
<cman/>
<fencedevices/>
<rm>
<failoverdomains>
<failoverdomain name="csarcsysfo" ordered="0" restricted="1">
<failoverdomainnode name="csarcsys1-eth0" priority="1"/>
<failoverdomainnode name="csarcsys2-eth0" priority="1"/>
<failoverdomainnode name="csarcsys3-eth0" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="172.xx.xx.xxx" monitor_link="1"/>
<fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
</resources>
</rm>
</cluster>
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
Sent: Monday, March 24, 2008 4:17 PM
To: linux clustering
Subject: Re: 3 node cluster problems
Did you load the cluster software via Conga or manually? You would have
had to load luci on one node and ricci on all three.
Try copying the modified /etc/cluster/cluster.conf from csarcsys1 to the
other two nodes.
Make sure you can ping the private interface to/from all nodes, and
reboot. If this does not work, post your /etc/cluster/cluster.conf file
again.
Dalton, Maurice wrote:
Yes
I also rebooted again just now to be sure.
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie Thomas
Sent: Monday, March 24, 2008 3:33 PM
To: linux clustering
Subject: Re: 3 node cluster problems
When you changed the node names in /etc/cluster/cluster.conf and made
sure the /etc/hosts file had the correct node names (e.g. 10.0.0.100
csarcsys1-eth0 csarcsys1-eth0.xxxx.xxxx.xxx), did you reboot all the
nodes at the same time?
Dalton, Maurice wrote:
No luck. It seems as if csarcsys3 thinks it's in its own cluster.
I renamed all config files and rebuilt from system-config-cluster.
clustat command from csarcsys3
[root@csarcsys3-eth0 cluster]# clustat
msg_open: No such file or directory
Member Status: Inquorate
Member Name ID Status
------ ---- ---- ------
csarcsys1-eth0 1 Offline
csarcsys2-eth0 2 Offline
csarcsys3-eth0 3 Online, Local
clustat command from csarcsys2
[root@csarcsys2-eth0 cluster]# clustat
msg_open: No such file or directory
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
csarcsys1-eth0 1 Online
csarcsys2-eth0 2 Online, Local
csarcsys3-eth0 3 Offline
-----Original Message-----
From: linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Bennie
Thomas
Sent: Monday, March 24, 2008 2:25 PM
To: linux clustering
Subject: Re: 3 node cluster problems
You will also need to make sure the clustered node names are in your
/etc/hosts file.
Also, make sure your cluster network interface is up on all nodes and
that /etc/cluster/cluster.conf is the same on all nodes.
Dalton, Maurice wrote:
The last post is incorrect.
Fence is still hanging at start up.
Here's another log message.
Mar 24 19:03:14 csarcsys3-eth0 ccsd[6425]: Error while processing connect: Connection refused
Mar 24 19:03:15 csarcsys3-eth0 dlm_controld[6453]: connect to ccs error -111, check ccsd or cluster status
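The "-111" in the dlm_controld message is a negated errno value. On Linux, errno 111 is ECONNREFUSED, which matches the "Connection refused" line from ccsd. A minimal Python sketch for decoding such codes follows; the helper name is made up for illustration and is not part of any cluster tool.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for a positive or negative errno value."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# On Linux this is expected to report ECONNREFUSED (Connection refused).
print(describe_errno(-111))
```

The numeric values are platform specific, so the script should be run on the cluster nodes themselves (or another Linux host) for the mapping to apply.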
*From:* linux-cluster-bounces@xxxxxxxxxx
[mailto:linux-cluster-bounces@xxxxxxxxxx] *On Behalf Of *Bennie
Thomas
*Sent:* Monday, March 24, 2008 11:22 AM
*To:* linux clustering
*Subject:* Re: 3 node cluster problems
Try removing the fully qualified hostname from the cluster.conf file.
Dalton, Maurice wrote:
I have NO fencing equipment.
I have been tasked to set up a 3-node cluster.
Currently I am having problems getting cman (fence) to start.
Fence will try to start up during cman startup but will fail.
I tried to run /sbin/fenced -D and got the following:
1206373475 cman_init error 0 111
Here's my cluster.conf file:
<?xml version="1.0"?>
<cluster alias="csarcsys51" config_version="26" name="csarcsys51">
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" nodeid="1" votes="1">
<fence/>
</clusternode>
<clusternode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" nodeid="2" votes="1">
<fence/>
</clusternode>
<clusternode name="csarcsys3-eth0.xxx.xxxxnasa.gov" nodeid="3" votes="1">
<fence/>
</clusternode>
</clusternodes>
<cman/>
<fencedevices/>
<rm>
<failoverdomains>
<failoverdomain name="csarcsys-fo" ordered="1" restricted="0">
<failoverdomainnode name="csarcsys1-eth0.xxx.xxxx.nasa.gov" priority="1"/>
<failoverdomainnode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" priority="1"/>
<failoverdomainnode name="csarcsys2-eth0.xxx.xxxx.nasa.gov" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="xxx.xxx.xxx.xxx" monitor_link="1"/>
<fs device="/dev/sdc1" force_fsck="0" force_unmount="1" fsid="57739" fstype="ext3" mountpoint="/csarc-test" name="csarcsys-fs" options="rw" self_fence="0"/>
<nfsexport name="csarcsys-export"/>
<nfsclient name="csarcsys-nfs-client" options="no_root_squash,rw" path="/csarc-test" target="xxx.xxx.xxx.*"/>
</resources>
</rm>
</cluster>
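As posted, the failover domain in this cluster.conf appears to list csarcsys2-eth0 twice and csarcsys3-eth0 not at all. Whether that is related to the startup failure is not clear from the thread, but this kind of slip is easy to catch mechanically. Below is a minimal Python sketch using only the standard library. The function name and the shortened sample config (with placeholder node names) are illustrative assumptions, not part of any Red Hat tooling.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_failover_domains(conf_xml):
    """Compare failoverdomainnode names against the declared clusternode names."""
    root = ET.fromstring(conf_xml)
    declared = {node.get("name") for node in root.iter("clusternode")}
    findings = []
    for domain in root.iter("failoverdomain"):
        label = domain.get("name")
        names = [n.get("name") for n in domain.iter("failoverdomainnode")]
        for name, count in Counter(names).items():
            if count > 1:
                findings.append(f"{label}: {name} listed {count} times")
            if name not in declared:
                findings.append(f"{label}: {name} is not a clusternode")
        for name in sorted(declared - set(names)):
            findings.append(f"{label}: {name} is not in this domain")
    return findings


# Shortened stand-in for the posted file; node1..node3 are placeholder names.
SAMPLE = """
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1"/>
    <clusternode name="node2" nodeid="2" votes="1"/>
    <clusternode name="node3" nodeid="3" votes="1"/>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="fo" ordered="1" restricted="0">
        <failoverdomainnode name="node1" priority="1"/>
        <failoverdomainnode name="node2" priority="1"/>
        <failoverdomainnode name="node2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
  </rm>
</cluster>
"""

for finding in check_failover_domains(SAMPLE):
    print(finding)
```

A node missing from a domain is not necessarily an error, so the last kind of finding is informational only.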
Messages from the logs
Mar 24 13:24:19 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate. Refusing connection.
Mar 24 13:24:19 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
Mar 24 13:24:20 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate. Refusing connection.
Mar 24 13:24:20 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
Mar 24 13:24:21 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate. Refusing connection.
Mar 24 13:24:21 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
Mar 24 13:24:22 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate. Refusing connection.
Mar 24 13:24:22 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
Mar 24 13:24:23 csarcsys2-eth0 ccsd[24888]: Cluster is not quorate. Refusing connection.
Mar 24 13:24:23 csarcsys2-eth0 ccsd[24888]: Error while processing connect: Connection refused
------------------------------------------------------------------------
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx <mailto:Linux-cluster@xxxxxxxxxx>
https://www.redhat.com/mailman/listinfo/linux-cluster