I decided this morning to start checking packages/versions first. Here are some details about the system thus far:
CONF:
<?xml version="1.0" ?>
<cluster alias="mrcluster" config_version="2" name="mrcluster">
<fence_daemon post_fail_delay="0" post_join_delay="30"/>
<clusternodes>
<clusternode name="clxmrcati12.xxxxxx.com" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="apcps05" option="off" port="3" switch="3"/>
<device name="apcps06" option="off" port="3" switch="3"/>
<device name="apcps05" option="on" port="3" switch="3"/>
<device name="apcps06" option="on" port="3" switch="3"/>
</method>
</fence>
</clusternode>
<clusternode name="clxmrcati11.xxxxxx.com" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="apcps05" option="off" port="4" switch="4"/>
<device name="apcps06" option="off" port="4" switch="4"/>
<device name="apcps05" option="on" port="4" switch="4"/>
<device name="apcps06" option="on" port="4" switch="4"/>
</method>
</fence>
</clusternode>
<clusternode name="clxmrweb20.xxxxxx.com" nodeid="3" votes="1">
<fence>
<method name="1">
<device name="apcps05" option="off" port="2" switch="2"/>
<device name="apcps06" option="off" port="2" switch="2"/>
<device name="apcps05" option="on" port="2" switch="2"/>
<device name="apcps06" option="on" port="2" switch="2"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman/>
<fencedevices>
<fencedevice agent="fence_apc" ipaddr="172.XX.XX.27" login="apc" name="apcps05" passwd="xxx"/>
<fencedevice agent="fence_apc" ipaddr="172.XX.XX.28" login="apc" name="apcps06" passwd="xxx"/>
</fencedevices>
<rm>
<failoverdomains/>
<resources/>
</rm>
</cluster>
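For anyone reading along, a quick sanity check on a config like the one above can be sketched in Python. This is only an illustration: `SAMPLE_CONF` is a trimmed copy of the posted file, `summarize` is a made-up helper name, and the quorum rule used (more than half of the expected votes, i.e. `expected_votes // 2 + 1`) is my understanding of how cman computes it, not something taken from this post.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cluster.conf (only the "off" fence actions kept).
SAMPLE_CONF = """
<cluster alias="mrcluster" config_version="2" name="mrcluster">
  <clusternodes>
    <clusternode name="clxmrcati12.xxxxxx.com" nodeid="1" votes="1">
      <fence><method name="1">
        <device name="apcps05" option="off" port="3" switch="3"/>
        <device name="apcps06" option="off" port="3" switch="3"/>
      </method></fence>
    </clusternode>
    <clusternode name="clxmrcati11.xxxxxx.com" nodeid="2" votes="1">
      <fence><method name="1">
        <device name="apcps05" option="off" port="4" switch="4"/>
        <device name="apcps06" option="off" port="4" switch="4"/>
      </method></fence>
    </clusternode>
    <clusternode name="clxmrweb20.xxxxxx.com" nodeid="3" votes="1">
      <fence><method name="1">
        <device name="apcps05" option="off" port="2" switch="2"/>
        <device name="apcps06" option="off" port="2" switch="2"/>
      </method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="172.XX.XX.27" login="apc" name="apcps05" passwd="xxx"/>
    <fencedevice agent="fence_apc" ipaddr="172.XX.XX.28" login="apc" name="apcps06" passwd="xxx"/>
  </fencedevices>
</cluster>
"""


def summarize(conf_xml):
    """Report per-node votes, the assumed quorum threshold, and any
    fence device that a node references but <fencedevices> does not define."""
    root = ET.fromstring(conf_xml.strip())
    votes = {n.get("name"): int(n.get("votes", "1"))
             for n in root.iter("clusternode")}
    expected = sum(votes.values())
    quorum = expected // 2 + 1  # assumption: simple majority of expected votes
    defined = {d.get("name") for d in root.iter("fencedevice")}
    used = {d.get("name") for d in root.iter("device")}
    return {
        "votes": votes,
        "expected_votes": expected,
        "quorum": quorum,
        "undefined_fence_devices": sorted(used - defined),
    }


summary = summarize(SAMPLE_CONF)
print(summary)
```

Under that assumption, three nodes with one vote each need two votes for quorum, so a node that only sees itself would stay inquorate.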
-------------------------------------------------------------------------------------------
Host Files:
From Luci Node clxmrcati11:
127.0.0.1 localhost.localdomain localhost
172.XX.XX.18 clxmrcati11.xxxxxx.com clxmrcati11
172.XX.XX.19 clxmrcati12.xxxxxx.com clxmrcati12
172.XX.XX.20 clxmrrpt10.xxxxxx.com clxmrrpt10
172.XX.XX.21 clxmrweb20.xxxxxx.com clxmrweb20
From ricci node clxmrcati12:
127.0.0.1 localhost.localdomain localhost
172.XX.XX.19 clxmrcati12.xxxxxx.com clxmrcati12
172.XX.XX.21 clxmrweb20.xxxxxx.com clxmrweb20
172.XX.XX.20 clxmrrpt10.xxxxxx.com clxmrrpt10
172.XX.XX.18 clxmrcati11.xxxxxx.com clxmrcati11
From ricci node clxmrweb20:
127.0.0.1 localhost.localdomain localhost
172.XX.XX.21 clxmrweb20.xxxxxx.com clxmrweb20
172.XX.XX.20 clxmrrpt10.xxxxxx.com clxmrrpt10
172.XX.XX.18 clxmrcati11.xxxxxx.com clxmrcati11
172.XX.XX.19 clxmrcati12.xxxxxx.com clxmrcati12
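Comparing hosts files by eye gets error-prone with more nodes, so here is a rough Python sketch of the same comparison. The helper names `parse_hosts` and `find_mismatches` are hypothetical, and the masked addresses from this post are treated as opaque strings.

```python
def parse_hosts(text):
    """Return {hostname_or_alias: address} for one hosts file."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping[name] = address
    return mapping


def find_mismatches(hosts_by_node):
    """Return names that are missing on some node or map to different addresses."""
    seen = {}
    for node, text in hosts_by_node.items():
        for name, address in parse_hosts(text).items():
            seen.setdefault(name, {})[node] = address
    every_node = set(hosts_by_node)
    return {
        name: per_node
        for name, per_node in seen.items()
        if len(set(per_node.values())) > 1 or set(per_node) != every_node
    }


# The three hosts files as posted (addresses are masked in the original).
HOSTS = {
    "clxmrcati11": """127.0.0.1 localhost.localdomain localhost
172.XX.XX.18 clxmrcati11.xxxxxx.com clxmrcati11
172.XX.XX.19 clxmrcati12.xxxxxx.com clxmrcati12
172.XX.XX.20 clxmrrpt10.xxxxxx.com clxmrrpt10
172.XX.XX.21 clxmrweb20.xxxxxx.com clxmrweb20""",
    "clxmrcati12": """127.0.0.1 localhost.localdomain localhost
172.XX.XX.19 clxmrcati12.xxxxxx.com clxmrcati12
172.XX.XX.21 clxmrweb20.xxxxxx.com clxmrweb20
172.XX.XX.20 clxmrrpt10.xxxxxx.com clxmrrpt10
172.XX.XX.18 clxmrcati11.xxxxxx.com clxmrcati11""",
    "clxmrweb20": """127.0.0.1 localhost.localdomain localhost
172.XX.XX.21 clxmrweb20.xxxxxx.com clxmrweb20
172.XX.XX.20 clxmrrpt10.xxxxxx.com clxmrrpt10
172.XX.XX.18 clxmrcati11.xxxxxx.com clxmrcati11
172.XX.XX.19 clxmrcati12.xxxxxx.com clxmrcati12""",
}

print(find_mismatches(HOSTS))
```

For the three files as posted the result is empty, which suggests name resolution is at least consistent across the nodes.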
Mostly this in /var/log/messages:
Aug 25 09:36:12 fenclxmrcati11 dlm_controld[2267]: connect to ccs error -111, check ccsd or cluster status
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:12 fenclxmrcati11 gfs_controld[2273]: connect to ccs error -111, check ccsd or cluster status
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:13 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:13 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:13 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:13 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:13 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:13 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:14 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:14 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:14 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:14 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
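Since the log is mostly the same few messages repeated, a small Python sketch like the following can condense it to one count per daemon and message. The regular expression assumes the classic syslog line layout seen above, and `count_messages` is a made-up name. As a side note, 111 is the Linux errno value for ECONNREFUSED, which matches the "Connection refused" text from ccsd.

```python
import re
from collections import Counter

# Assumes lines shaped like: "Mon DD HH:MM:SS host daemon[pid]: message"
LOG_LINE = re.compile(
    r"^\w{3}\s+\d+ [\d:]{8} (?P<host>\S+) (?P<daemon>[\w.-]+)\[\d+\]: (?P<msg>.*)$"
)

# First six lines of the excerpt above.
SAMPLE_LOG = """\
Aug 25 09:36:12 fenclxmrcati11 dlm_controld[2267]: connect to ccs error -111, check ccsd or cluster status
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
Aug 25 09:36:12 fenclxmrcati11 gfs_controld[2273]: connect to ccs error -111, check ccsd or cluster status
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Cluster is not quorate. Refusing connection.
Aug 25 09:36:12 fenclxmrcati11 ccsd[3758]: Error while processing connect: Connection refused
"""


def count_messages(log_text):
    """Count identical (daemon, message) pairs, ignoring timestamps and PIDs."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            counts[(match.group("daemon"), match.group("msg"))] += 1
    return counts


for (daemon, msg), n in count_messages(SAMPLE_LOG).most_common():
    print(f"{n:3d}  {daemon}: {msg}")
```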
On Thu, Aug 27, 2009 at 3:27 AM, Jakov Sosic <jakov.sosic@xxxxxxx> wrote:
On Wed, 26 Aug 2009 18:36:26 -0500
Looks like a network issue to me.
> I have tried almost everything at this point to try and troubleshoot
> this further. I can't create a new cluster with luci.
>
> I tore down and tried to reconfigure the 3-node cluster at least 6 times.
>
> I have noticed nodes taking exceptionally long to initialize
> fencing upon cman start. I tried with defined and undefined fencing,
> and the amount of time needed is still the same. Even after the fencing
> step is past, /var/log/messages shows nodes refusing to join the cluster
> due to the state of 'not in quorum' during the joining process. I upped
> post_join_delay as high as 150 but the result is the same.
>
> Fencing - I use APC PW Switches - I can log in to the APC PWS from the
> node, I can even fence the other node, but when cman is started it
> looks like it is almost timing out on starting fencing.
>
> If I issue cman_tool nodes it gives me the local node name as a
> member of the cluster and the other two with state 'X'. If I try
> cman_tool join clustername - it tells me the nodes are already in
> that cluster, but the cluster as a whole does not register. Each node
> thinks it's the only working member of the cluster.
>
>
> Any pointers?
Are you sure your network is operational in the sense of multicast /
IGMP? Try forcing IGMP v1 in sysctl.conf - and if you have Cisco
equipment, take a look at the openais FAQ (sparse-dense mode).
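One way to test the multicast question independently of the cluster stack is a small probe like the Python sketch below. Everything specific in it is an assumption for illustration: the group `239.192.0.1` and port `50000` are hypothetical values, not the ones the cluster uses, and the function names are made up. Run the receiver on one node and the sender on another.

```python
import socket
import struct

GROUP = "239.192.0.1"  # hypothetical multicast group, not taken from this thread
PORT = 50000           # hypothetical unused UDP port


def membership_request(group, iface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address + local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def open_receiver(group=GROUP, port=PORT, timeout=5.0):
    """Join the multicast group and return a socket ready for recvfrom()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)  # recvfrom() raises socket.timeout if nothing arrives
    return sock


def send_probe(message=b"probe", group=GROUP, port=PORT, ttl=1):
    """Send one datagram to the group; ttl=1 keeps it on the local segment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    try:
        sock.sendto(message, (group, port))
    finally:
        sock.close()
```

On the receiving node, call `open_receiver().recvfrom(1024)`, then call `send_probe()` on another node. If the receiver times out while plain unicast between the same hosts works, that would point toward the switch or IGMP handling rather than the cluster configuration.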
--
| Jakov Sosic | ICQ: 28410271 | PGP: 0x965CAE2D |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/ |
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
--
Alan A.