Re: Linux-cluster Digest, Vol 57, Issue 5

# Add this line to the syslog.conf file & restart syslog
# ========================================================
# vi /etc/syslog.conf

# rgmanager log
  local4.*                     /var/log/rgmanager

# Create log file before restarting the syslog
# ========================================================
# touch /var/log/rgmanager
# chmod 644 /var/log/rgmanager
# chown root.root /var/log/rgmanager

# service syslog restart
Shutting down kernel logger: [  OK  ]
Shutting down system logger: [  OK  ]
Starting system logger: [  OK  ]
Starting kernel logger: [  OK  ]

# Change cluster config file to log rgmanager info
# ========================================================

# vi /etc/cluster/cluster.conf

Change the line
<rm>
to
<rm log_facility="local4" log_level="7">



# Push changes to all cluster nodes
# ========================================================

# ccs_tool update /etc/cluster/cluster.conf
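
# Note: ccs_tool update only pushes the new cluster.conf if its
# config_version attribute is higher than the version the cluster is
# currently running, so bump it before running the command above.
# A minimal sketch of the edit (the version number here is only an example):
#
# <cluster alias="ipmicluster" config_version="9" name="ipmicluster">
#         ...
#         <rm log_facility="local4" log_level="7">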

Unplug and plug back the network cable on the node, then
look at the /var/log/rgmanager file.
It may contain useful info for us.
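
To watch it live while you replug the cable, something like this should work:

# tail -f /var/log/rgmanager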





On Tue, Jan 6, 2009 at 12:00 PM, <linux-cluster-request@xxxxxxxxxx> wrote:
Send Linux-cluster mailing list submissions to
       linux-cluster@xxxxxxxxxx

To subscribe or unsubscribe via the World Wide Web, visit
       https://www.redhat.com/mailman/listinfo/linux-cluster
or, via email, send a message with subject or body 'help' to
       linux-cluster-request@xxxxxxxxxx

You can reach the person managing the list at
       linux-cluster-owner@xxxxxxxxxx

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Linux-cluster digest..."


Today's Topics:

  1. Re: Re: Fencing test (Paras pradhan)
  2. problem adding new node to an existing cluster
     (Greenseid, Joseph M.)
  3. Re: problem adding new node to an existing cluster (Bob Peterson)
  4. RE: problem adding new node to an existing cluster
     (Greenseid, Joseph M.)
  5. RE: problem adding new node to an existing cluster
     (Greenseid, Joseph M.)
  6. RE: problem adding new node to an existing cluster
     (Greenseid, Joseph M.)
  7. Re: problem adding new node to an existing cluster (Bob Peterson)
  8. RE: problem adding new node to an existing cluster
     (Greenseid, Joseph M.)


----------------------------------------------------------------------

Message: 1
Date: Mon, 5 Jan 2009 12:11:24 -0600
From: "Paras pradhan" <pradhanparas@xxxxxxxxx>
Subject: Re: Re: Fencing test
To: "linux clustering" <linux-cluster@xxxxxxxxxx>
Message-ID:
       <8b711df40901051011x79066243g38108439ffb1075f@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1

hi,

On Mon, Jan 5, 2009 at 8:23 AM, Rajagopal Swaminathan
<raju.rajsand@xxxxxxxxx> wrote:
> Greetings,
>
> On Sat, Jan 3, 2009 at 4:18 AM, Paras pradhan <pradhanparas@xxxxxxxxx> wrote:
>>
>> Here I am using 4 nodes.
>>
>> Node 1) That runs luci
>> Node 2) This is my iSCSI shared storage where my virtual machine(s) reside
>> Node 3) First node in my two node cluster
>> Node 4) Second node in my two node cluster
>>
>> All of them are connected simply to an unmanaged 16 port switch.
>
> Luci does not require a separate node to run; it can run on one of the
> member nodes (node 3 or 4).

OK.

>
> what does clustat say?

Here is my clustat output:

-----------

[root@ha1lx ~]# clustat
Cluster Status for ipmicluster @ Mon Jan  5 12:00:10 2009
Member Status: Quorate

 Member Name                         ID   Status
 ------ ----                         ---- ------
 10.42.21.29                            1 Online, rgmanager
 10.42.21.27                            2 Online, Local, rgmanager

 Service Name           Owner (Last)           State
 ------- ----           ----- ------           -----
 vm:linux64             10.42.21.27            started
[root@ha1lx ~]#
------------------------


10.42.21.27 is node3 and 10.42.21.29 is node4



>
> Can you post your cluster.conf here?

Here is my cluster.conf

--
[root@ha1lx cluster]# more cluster.conf
<?xml version="1.0"?>
<cluster alias="ipmicluster" config_version="8" name="ipmicluster">
       <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
       <clusternodes>
               <clusternode name="10.42.21.29" nodeid="1" votes="1">
                       <fence>
                               <method name="1">
                                       <device name="fence2"/>
                               </method>
                       </fence>
               </clusternode>
               <clusternode name="10.42.21.27" nodeid="2" votes="1">
                       <fence>
                               <method name="1">
                                       <device name="fence1"/>
                               </method>
                       </fence>
               </clusternode>
       </clusternodes>
       <cman expected_votes="1" two_node="1"/>
       <fencedevices>
               <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.28"
login="admin" name="fence1" passwd="admin"/>
               <fencedevice agent="fence_ipmilan" ipaddr="10.42.21.30"
login="admin" name="fence2" passwd="admin"/>
       </fencedevices>
       <rm>
               <failoverdomains>
                       <failoverdomain name="myfd" nofailback="0" ordered="1" restricted="0">
                               <failoverdomainnode name="10.42.21.29" priority="2"/>
                               <failoverdomainnode name="10.42.21.27" priority="1"/>
                       </failoverdomain>
               </failoverdomains>
               <resources/>
               <vm autostart="1" domain="myfd" exclusive="0" migrate="live"
name="linux64" path="/guest_roots" recovery="restart"/>
       </rm>
</cluster>
------


Here:

10.42.21.28 is IPMI interface in node3
10.42.21.30 is IPMI interface in node4








>
> When you pull out the network cable *and* plug it back in on, say, node 3,
> what messages appear in /var/log/messages on node 4 (if any)?
> (Sorry for the repetition, but messages are necessary here to make any
> sense of the situation.)
>

OK, here is the log on node 4 after I disconnect the network cable on node 3.

-----------

Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] The token was lost in the
OPERATIONAL state.
Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] Receive multicast socket
recv buffer size (288000 bytes).
Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] Transmit multicast socket
send buffer size (262142 bytes).
Jan  5 12:05:24 ha2lx openais[4988]: [TOTEM] entering GATHER state from 2.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering GATHER state from 0.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Creating commit token
because I am the rep.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Saving state aru 76 high
seq received 76
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Storing new sequence id
for ring ac
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering COMMIT state.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering RECOVERY state.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] position [0] member 10.42.21.29:
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] previous ring seq 168 rep
10.42.21.27
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] aru 76 high delivered 76
received flag 1
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Did not need to originate
any messages in recovery.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] Sending initial ORF token
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.29)
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.27)
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:05:28 ha2lx kernel: dlm: closing connection to node 2
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:05:28 ha2lx fenced[5004]: 10.42.21.27 not a cluster member
after 0 sec post_fail_delay
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.29)
Jan  5 12:05:28 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Trying to acquire journal lock...
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:05:28 ha2lx openais[4988]: [SYNC ] This node is within the
primary component and will provide service.
Jan  5 12:05:28 ha2lx openais[4988]: [TOTEM] entering OPERATIONAL state.
Jan  5 12:05:28 ha2lx openais[4988]: [CLM  ] got nodejoin message 10.42.21.29
Jan  5 12:05:28 ha2lx openais[4988]: [CPG  ] got joinlist message from node 1
Jan  5 12:05:28 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Looking at journal...
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Acquiring the transaction lock...
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Replaying journal...
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Replayed 0 of 0 blocks
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Found 0 revoke tags
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0:
jid=1: Journal replayed in 1s
Jan  5 12:05:29 ha2lx kernel: GFS2: fsid=ipmicluster:guest_roots.0: jid=1: Done
------------------

Now when I plug the cable back into node 3, node 4 reboots; here is the
quickly grabbed log on node 4:


--
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering GATHER state from 11.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] Saving state aru 1d high
seq received 1d
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] Storing new sequence id
for ring b0
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering COMMIT state.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering RECOVERY state.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] position [0] member 10.42.21.27:
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] previous ring seq 172 rep
10.42.21.27
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] aru 16 high delivered 16
received flag 1
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] position [1] member 10.42.21.29:
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] previous ring seq 172 rep
10.42.21.29
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] aru 1d high delivered 1d
received flag 1
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] Did not need to originate
any messages in recovery.
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.29)
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] CLM CONFIGURATION CHANGE
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] New Configuration:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.27)
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.29)
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Left:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ] Members Joined:
Jan  5 12:07:12 ha2lx openais[4988]: [CLM  ]    r(0) ip(10.42.21.27)
Jan  5 12:07:12 ha2lx openais[4988]: [SYNC ] This node is within the
primary component and will provide service.
Jan  5 12:07:12 ha2lx openais[4988]: [TOTEM] entering OPERATIONAL state.
Jan  5 12:07:12 ha2lx openais[4988]: [MAIN ] Killing node 10.42.21.27
because it has rejoined the cluster with existing state
Jan  5 12:07:12 ha2lx openais[4988]: [CMAN ] cman killed by node 2
because we rejoined the cluster without a full restart
Jan  5 12:07:12 ha2lx gfs_controld[5016]: groupd_dispatch error -1 errno 11
Jan  5 12:07:12 ha2lx gfs_controld[5016]: groupd connection died
Jan  5 12:07:12 ha2lx gfs_controld[5016]: cluster is down, exiting
Jan  5 12:07:12 ha2lx dlm_controld[5010]: cluster is down, exiting
Jan  5 12:07:12 ha2lx kernel: dlm: closing connection to node 1
Jan  5 12:07:12 ha2lx fenced[5004]: cluster is down, exiting
-------


Also here is the log of node3:

--
[root@ha1lx ~]# tail -f /var/log/messages
Jan  5 12:07:24 ha1lx openais[26029]: [TOTEM] entering OPERATIONAL state.
Jan  5 12:07:24 ha1lx openais[26029]: [CLM  ] got nodejoin message 10.42.21.27
Jan  5 12:07:24 ha1lx openais[26029]: [CLM  ] got nodejoin message 10.42.21.27
Jan  5 12:07:24 ha1lx openais[26029]: [CPG  ] got joinlist message from node 2
Jan  5 12:07:27 ha1lx ccsd[26019]: Attempt to close an unopened CCS
descriptor (4520670).
Jan  5 12:07:27 ha1lx ccsd[26019]: Error while processing disconnect:
Invalid request descriptor
Jan  5 12:07:27 ha1lx fenced[26045]: fence "10.42.21.29" success
Jan  5 12:07:27 ha1lx kernel: GFS2: fsid=ipmicluster:guest_roots.1:
jid=0: Trying to acquire journal lock...
Jan  5 12:07:27 ha1lx kernel: GFS2: fsid=ipmicluster:guest_roots.1:
jid=0: Looking at journal...
Jan  5 12:07:28 ha1lx kernel: GFS2: fsid=ipmicluster:guest_roots.1: jid=0: Done
----------------












> HTH
>
> With warm regards
>
> Rajagopal
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
>


Thanks a lot

Paras.



------------------------------

Message: 2
Date: Mon, 5 Jan 2009 14:18:10 -0600
From: "Greenseid, Joseph M." <Joseph.Greenseid@xxxxxxx>
Subject: problem adding new node to an existing
       cluster
To: <linux-cluster@xxxxxxxxxx>
Message-ID:
       <D089B7B0C0FBCD498494B5A0AA74827DDB386E@xxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

hi all,

i am trying to add a new node to an existing 3 node GFS cluster.

i followed the steps in the online docs for this, so i went onto the 1st node in my existing cluster, run system-config-cluster, added a new node and fence for it, then propagated that out to the existing nodes, and scp'd the cluster.conf file to the new node.

at that point, i confirmed that multipath and mdadm config files were synced with my other nodes, the new node can properly see the SAN that they're all sharing, etc.

i then started cman, which seemed to start without any trouble.  i tried to start clvmd, but it says:

Activating VGs: Skipping clustered volume group san01

my VG is named "san01," so it can see the volume group, it just won't activate it for some reason.  any ideas what i'm doing wrong?

thanks,
--Joe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://www.redhat.com/archives/linux-cluster/attachments/20090105/d4760d53/attachment.html

------------------------------

Message: 3
Date: Mon, 5 Jan 2009 15:25:36 -0500 (EST)
From: Bob Peterson <rpeterso@xxxxxxxxxx>
Subject: Re: problem adding new node to an existing
       cluster
To: linux clustering <linux-cluster@xxxxxxxxxx>
Message-ID:
       <868569604.2835591231187135219.JavaMail.root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

Content-Type: text/plain; charset=utf-8

----- "Joseph M. Greenseid" <Joseph.Greenseid@xxxxxxx> wrote:
| hi all,
|
| i am trying to add a new node to an existing 3 node GFS cluster.
|
| i followed the steps in the online docs for this, so i went onto the
| 1st node in my existing cluster, run system-config-cluster, added a
| new node and fence for it, then propagated that out to the existing
| nodes, and scp'd the cluster.conf file to the new node.
|
| at that point, i confirmed that multipath and mdadm config files were
| synced with my other nodes, the new node can properly see the SAN that
| they're all sharing, etc.
|
| i then started cman, which seemed to start without any trouble. i
| tried to start clvmd, but it says:
|
| Activating VGs: Skipping clustered volume group san01
|
| my VG is named "san01," so it can see the volume group, it just won't
| activate it for some reason. any ideas what i'm doing wrong?
|
| thanks,
| --Joe

Hi Joe,

Make sure that you have clvmd service running on the new node
("chkconfig clvmd on" and/or "service clvmd start" as necessary).
Also, make sure the lock_type is 2 (RHEL4/similar) or 3 (RHEL5/similar)
in the /etc/lvm/lvm.conf file.
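
For reference, the setting in question is the locking_type parameter in
the global section of /etc/lvm/lvm.conf; on a RHEL5-style node running
clvmd it would look roughly like:

    locking_type = 3    # built-in clustered locking via clvmd

followed by a clvmd restart so the clustered volume group gets activated.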

Regards,

Bob Peterson
Red Hat GFS



------------------------------

Message: 4
Date: Mon, 5 Jan 2009 14:28:12 -0600
From: "Greenseid, Joseph M." <Joseph.Greenseid@xxxxxxx>
Subject: RE: problem adding new node to an existing
       cluster
To: "linux clustering" <linux-cluster@xxxxxxxxxx>
Message-ID:
       <D089B7B0C0FBCD498494B5A0AA74827DDB386F@xxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

---- "Joseph M. Greenseid" <Joseph.Greenseid@xxxxxxx> wrote:
| hi all,
|
| i am trying to add a new node to an existing 3 node GFS cluster.
|
| i followed the steps in the online docs for this, so i went onto the
| 1st node in my existing cluster, run system-config-cluster, added a
| new node and fence for it, then propagated that out to the existing
| nodes, and scp'd the cluster.conf file to the new node.
|
| at that point, i confirmed that multipath and mdadm config files were
| synced with my other nodes, the new node can properly see the SAN that
| they're all sharing, etc.
|
| i then started cman, which seemed to start without any trouble. i
| tried to start clvmd, but it says:
|
| Activating VGs: Skipping clustered volume group san01
|
| my VG is named "san01," so it can see the volume group, it just won't
| activate it for some reason. any ideas what i'm doing wrong?
|
| thanks,
| --Joe

> Hi Joe,

> Make sure that you have clvmd service running on the new node
> ("chkconfig clvmd on" and/or "service clvmd start" as necessary).

Hi Bob,

Yes, this problem started when I tried to start clvmd (/sbin/service clvmd start).


> Also, make sure the lock_type is 2 (RHEL4/similar) or 3 (RHEL5/similar)
> in the /etc/lvm/lvm.conf file.

Ah, Ok, I believe this may be the trouble.  My lock_type was 1.  I'll change it and try again.  Thanks.

--Joe

> Regards,

> Bob Peterson
> Red Hat GFS


-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/ms-tnef
Size: 4399 bytes
Desc: not available
Url : https://www.redhat.com/archives/linux-cluster/attachments/20090105/6da60c4d/attachment.bin

------------------------------

Message: 5
Date: Mon, 5 Jan 2009 15:10:29 -0600
From: "Greenseid, Joseph M." <Joseph.Greenseid@xxxxxxx>
Subject: RE: problem adding new node to an existing
       cluster
To: "linux clustering" <linux-cluster@xxxxxxxxxx>,      "linux clustering"
       <linux-cluster@xxxxxxxxxx>
Message-ID:
       <D089B7B0C0FBCD498494B5A0AA74827DDB3872@xxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

> Also, make sure the lock_type is 2 (RHEL4/similar) or 3 (RHEL5/similar)
> in the /etc/lvm/lvm.conf file.

This fixed it.  Thanks.

--Joe

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://www.redhat.com/archives/linux-cluster/attachments/20090105/0999baeb/attachment.html

------------------------------

Message: 6
Date: Mon, 5 Jan 2009 16:01:45 -0600
From: "Greenseid, Joseph M." <Joseph.Greenseid@xxxxxxx>
Subject: RE: problem adding new node to an existing
       cluster
To: "linux clustering" <linux-cluster@xxxxxxxxxx>
Message-ID:
       <D089B7B0C0FBCD498494B5A0AA74827DDB3873@xxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

I have a new question.  When I created this file system a year ago, I didn't anticipate needing any additional nodes other than the original 3 I set up.  Consequently, I have 3 journals.  Now that I've been told to add a fourth node, is there a way to add a journal to an existing file system that resides on a volume that has not been expanded (the docs appear to read that you can only do it to an expanded volume because the additional journal(s) take up additional space).  My file system isn't full, though my volume is fully used by the formatted GFS file system.

Is there anything I can do that won't involve destroying my existing file system?

Thanks,
--Joe
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/ms-tnef
Size: 3699 bytes
Desc: not available
Url : https://www.redhat.com/archives/linux-cluster/attachments/20090105/ddb0e237/attachment.bin

------------------------------

Message: 7
Date: Mon, 5 Jan 2009 18:09:18 -0500 (EST)
From: Bob Peterson <rpeterso@xxxxxxxxxx>
Subject: Re: problem adding new node to an existing
       cluster
To: linux clustering <linux-cluster@xxxxxxxxxx>
Message-ID:
       <291064814.51231196957732.JavaMail.root@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>

Content-Type: text/plain; charset=utf-8

----- "Joseph M. Greenseid" <Joseph.Greenseid@xxxxxxx> wrote:
| Hi,
|
| I have a new question.  When I created this file system a year ago, I
| didn't anticipate needing any additional nodes other than the original
| 3 I set up.  Consequently, I have 3 journals.  Now that I've been told
| to add a fourth node, is there a way to add a journal to an existing
| file system that resides on a volume that has not been expanded (the
| docs appear to read that you can only do it to an expanded volume
| because the additional journal(s) take up additional space).  My file
| system isn't full, though my volume is fully used by the formatted GFS
| file system.
|
| Is there anything I can do that won't involve destroying my existing
| file system?
|
| Thanks,
| --Joe

Hi Joe,

Journals for gfs file systems are carved out during mkfs.  The rest of the
space is used for data and metadata.  So there are only two ways to
make journals: (1) Do another mkfs which will destroy your file system
or (2) if you're using lvm, add more storage with something like
lvresize or lvextend, then use gfs_jadd to add the new journal to the
new chunk of storage.
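
As a rough sketch of option (2), assuming the file system sits on an LVM
logical volume and is mounted at /mnt/gfs (both names hypothetical):

  lvextend -L +512M /dev/myvg/gfslv
  gfs_jadd -j 1 /mnt/gfs

where 512M comfortably holds one additional journal of the default 128MB size.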

We realize that's a pain, and that's why we took away that restriction
in gfs2.  In gfs2, journals are kept as a hidden part of the file system,
so they can be added painlessly to an existing file system without
adding storage.   So I guess a third option would be to convert the file
system to gfs2 using gfs2_convert, add the journal with gfs2_jadd, then
use it as gfs2 from then on.  But please be aware that gfs2_convert had some
serious problems until the 5.3 version that was committed to the cluster
git tree in December, (i.e. the very latest and greatest "RHEL5", "RHEL53",
"master", "STABLE2" or "STABLE3" versions in the cluster git (source code)
tree.)  Make ABSOLUTELY CERTAIN that you have a working & recent backup and
restore option before you try this.  Also, the GFS2 kernel code prior to
5.3 is considered tech preview as well, so not ready for production use.
So if you're not building from source code, you should wait until RHEL5.3
or Centos5.3 (or similar) before even considering this option.
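
If you do eventually go that route, the conversion itself would look
roughly like the following (device and mount point names are hypothetical),
run against an unmounted file system and only after a verified backup, as
stressed above:

  gfs_fsck /dev/myvg/gfslv
  gfs2_convert /dev/myvg/gfslv
  mount /dev/myvg/gfslv /mnt/gfs
  gfs2_jadd -j 1 /mnt/gfs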

Regards,

Bob Peterson
Red Hat GFS



------------------------------

Message: 8
Date: Tue, 6 Jan 2009 07:57:21 -0600
From: "Greenseid, Joseph M." <Joseph.Greenseid@xxxxxxx>
Subject: RE: problem adding new node to an existing
       cluster
To: "linux clustering" <linux-cluster@xxxxxxxxxx>,      "linux clustering"
       <linux-cluster@xxxxxxxxxx>
Message-ID:
       <D089B7B0C0FBCD498494B5A0AA74827DDB3875@xxxxxxxxxxxxxxxxxxxxxx>
Content-Type: text/plain; charset="iso-8859-1"

---- "Joseph M. Greenseid" <Joseph.Greenseid@xxxxxxx> wrote:
| Hi,
|
| I have a new question.  When I created this file system a year ago, I
| didn't anticipate needing any additional nodes other than the original
| 3 I set up.  Consequently, I have 3 journals.  Now that I've been told
| to add a fourth node, is there a way to add a journal to an existing
| file system that resides on a volume that has not been expanded (the
| docs appear to read that you can only do it to an expanded volume
| because the additional journal(s) take up additional space).  My file
| system isn't full, though my volume is fully used by the formatted GFS
| file system.
|
| Is there anything I can do that won't involve destroying my existing
| file system?
|
| Thanks,
| --Joe

> Hi Joe,

> Journals for gfs file systems are carved out during mkfs.  The rest of the
> space is used for data and metadata.  So there are only two ways to
> make journals: (1) Do another mkfs which will destroy your file system
> or (2) if you're using lvm, add more storage with something like
> lvresize or lvextend, then use gfs_jadd to add the new journal to the
> new chunk of storage.
>

Ok, so I did understand correctly.  That's at least something positive.  :)


> We realize that's a pain, and that's why we took away that restriction
> in gfs2.  In gfs2, journals are kept as a hidden part of the file system,
> so they can be added painlessly to an existing file system without
> adding storage.   So I guess a third option would be to convert the file
> system to gfs2 using gfs2_convert, add the journal with gfs2_jadd, then
> use it as gfs2 from then on.  But please be aware that gfs2_convert had some
> serious problems until the 5.3 version that was committed to the cluster
> git tree in December, (i.e. the very latest and greatest "RHEL5", "RHEL53",
> "master", "STABLE2" or "STABLE3" versions in the cluster git (source code)
> tree.)  Make ABSOLUTELY CERTAIN that you have a working & recent backup and
> restore option before you try this.  Also, the GFS2 kernel code prior to
> 5.3 is considered tech preview as well, so not ready for production use.
> So if you're not building from source code, you should wait until RHEL5.3
> or Centos5.3 (or similar) before even considering this option.
>


Ok, I have an earlier version of GFS2, so I guess I'm going to need to sit down and figure out a better strategy for what I've been asked to do.  I appreciate the help with my questions, though.  Thanks again.

--Joe

> Regards,
>
> Bob Peterson
> Red Hat GFS



-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://www.redhat.com/archives/linux-cluster/attachments/20090106/78398c16/attachment.html

------------------------------

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

End of Linux-cluster Digest, Vol 57, Issue 5
********************************************

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
