Re: 2 node cluster showing strange behaviour

"Ben .T.George" <bentech4you@xxxxxxxxx> · Tue, 18 Sep 2012 06:17:24 +0300

Hi, thanks for your reply.

Below is my cluster.conf file:

<?xml version="1.0"?>
<cluster config_version="7" name="eccprd">
        <clusternodes>
                <clusternode name="cgceccprd1.combinedgroup.net" nodeid="1">
                        <fence>
                                <method name="ucs-node1"/>
                        </fence>
                </clusternode>
                <clusternode name="cgceccprd2.combinedgroup.net" nodeid="2">
                        <fence>
                                <method name="ucs-node2"/>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <rm>
                <resources>
                        <ip address="172.22.10.230" sleeptime="10"/>
                </resources>
                <service exclusive="1" name="eccsapmnt" recovery="relocate">
                        <ip ref="172.22.10.230"/>
                </service>
        </rm>
        <fencedevices>
                <fencedevice agent="fence_cisco_ucs" ipaddr="172.22.90.61" login="admin" name="ucs-node1" passwd="duc2Cisco"/>
                <fencedevice agent="fence_cisco_ucs" ipaddr="172.22.90.59" login="admin" name="ucs-node2" passwd="duc2Cisco"/>
        </fencedevices>
</cluster>

When I try to start the cluster on node1, I get these messages in /var/log/messages:

 tail -f -n 0 /var/log/messages
Sep 18 06:06:02 cgceccprd1 modcluster: Starting service: eccsapmnt on node 
Sep 18 06:06:08 cgceccprd1 modcluster: Starting service: eccsapmnt on node cgceccprd1.combinedgroup.net

But the service does not start. luci shows both nodes as online, but clustat shows something different.

The main error I am getting in the messages log is:

Sep 18 03:35:48 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying
Sep 18 04:06:16 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying
Sep 18 04:36:45 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying
Sep 18 05:07:14 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying
Sep 18 05:37:42 cgceccprd1 fenced[8424]: fencing node cgceccprd2.combinedgroup.net still retrying

These messages are from node1. I am getting the same message on node2, saying:

cgceccprd2 fenced[8424]: fencing node cgceccprd1.combinedgroup.net still retrying

I don't know what the problem is here.

Please help me solve this.
Regards,
Ben

On Tue, Sep 18, 2012 at 5:03 AM, Digimer <lists@xxxxxxxxxx> wrote:

On 09/17/2012 04:56 PM, Ben .T.George wrote:

Hi,

My cman_tool status output shows a multicast address, as below:

Multicast addresses : 239.192.140.34

I tried to ping this IP, but it does not respond. I don't know much about multicast configuration.

Please help me check multicast further.

Multicast IPs do not represent a specific machine, but a group. Think of multicast as a sort of "mailing list": a machine "subscribes" to the multicast group, and the switch then says "right, when a packet comes in addressed to the multicast group, forward a copy to all subscribed machines". With Cisco, you need to create persistent multicast groups in the switch, if I recall correctly (I don't use Cisco myself).
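To make the "subscribe" step concrete, here is a minimal Python sketch of how a program joins a multicast group. It only illustrates the mechanism and is not how cman itself is implemented. The function names are made up for this example, and port 5405 is an assumption on my part, so check your own configuration for the port actually in use.

```python
import socket
import struct

GROUP = "239.192.140.34"  # multicast address from the cman_tool output
PORT = 5405               # assumption: verify the port your cluster really uses


def build_membership_request(group, interface="0.0.0.0"):
    """Pack the group address and the local interface address (4 bytes each).

    "0.0.0.0" lets the kernel choose the interface.
    """
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_subscriber(group, port):
    """Open a UDP socket and join ("subscribe" to) the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what makes the host announce its membership,
    # so that the switch knows to forward copies of the group's traffic here.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group))
    return sock
```

Calling open_subscriber(GROUP, PORT) and then recvfrom() on the returned socket on one node, while the other node sends to the group, is a rough way to see whether multicast traffic crosses the switch. A plain ping to the group address is not a reliable test, because the address does not belong to any single machine.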

In your case though, you're showing "cgceccprd1.combinedgroup.net" as "Offline" from both nodes, so the cluster is simply not starting.

Can you paste your cluster.conf please? Also, can you run 'tail -f -n 0 /var/log/messages' in a terminal on cgceccprd1, try to start the cluster, wait for it to fail and then paste the output along with your configuration?

This might help (the "Overview" section at the start, if nothing else). There is a sample cluster.conf there.

https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial
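When comparing a cluster.conf against a sample like that, one thing worth checking is that every <method> under <fence> contains at least one <device> entry, and that each <device> names a <fencedevice> that is actually defined. A method with no device gives fenced nothing to call. Here is a rough Python sketch of that check, assuming the usual cluster.conf layout. The function name, the sample node names and the "fence_example" agent are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: node1 has an empty fence method, node2 is wired correctly.
SAMPLE = """
<cluster config_version="1" name="example">
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1">
      <fence>
        <method name="m1"/>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2">
      <fence>
        <method name="m1">
          <device name="dev2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_example" name="dev2"/>
  </fencedevices>
</cluster>
"""


def check_fence_methods(conf_text):
    """Return (node, method, problem) tuples for suspicious fence methods."""
    root = ET.fromstring(conf_text.strip())
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    problems = []
    for node in root.iter("clusternode"):
        for method in node.iter("method"):
            devices = method.findall("device")
            if not devices:
                problems.append((node.get("name"), method.get("name"),
                                 "no <device> in method"))
            for dev in devices:
                if dev.get("name") not in defined:
                    problems.append((node.get("name"), method.get("name"),
                                     "unknown fence device " + repr(dev.get("name"))))
    return problems
```

To run it against a real file, read /etc/cluster/cluster.conf into a string and pass it to check_fence_methods(). An empty list means every method references a defined fence device. It does not prove that the fence agent itself works.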

hope that helps

-- 

Digimer

Papers and Projects: https://alteeve.ca

-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster