Re: Node is randomly fenced

Hi,

Linux clustering only works reliably when all the nodes run the same Linux operating system on identical hardware, with a persistent root SSH connection between them.

For a test or proof of concept, you can set it up between just two nodes.

The databases for clustering can be configured once both nodes are up and running with a persistent root SSH connection in place.
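For illustration, a minimal two-node cluster.conf sketch for cman on RHEL 6; the cluster name, hostnames, and the (omitted) fencing setup are placeholders, not taken from this thread:

  <?xml version="1.0"?>
  <cluster name="testcluster" config_version="1">
    <!-- two_node/expected_votes let a two-node cluster keep
         quorum when one node goes away -->
    <cman two_node="1" expected_votes="1"/>
    <clusternodes>
      <clusternode name="node1.example.com" nodeid="1"/>
      <clusternode name="node2.example.com" nodeid="2"/>
    </clusternodes>
    <!-- a real deployment needs working fence devices here -->
    <fencedevices/>
  </cluster>

Passwordless root SSH between the nodes can be set up along these lines:

  ssh-keygen -t rsa                    # on node1, accept the defaults
  ssh-copy-id root@node2.example.com   # copy the key to the peer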

Sent from Yahoo Mail for iPhone


From: Schaefer, Micah <Micah.Schaefer@xxxxxxxxxx>;
To: linux clustering <linux-cluster@xxxxxxxxxx>;
Subject: Re: Node is randomly fenced
Sent: Tue, Jun 17, 2014 2:27:29 PM

I am running Red Hat 6.4 with the HA/load balancing packages from the
install DVD.


-bash-4.1$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.4 (Santiago)

-bash-4.1$ corosync -v
Corosync Cluster Engine, version '1.4.1'
Copyright (c) 2006-2009 Red Hat, Inc.
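Since Chrissie's question below is whether this is an already-fixed bug, one generic way to check the installed build and any available errata on RHEL 6 (commands only; output shown nowhere in this thread):

  rpm -q corosync                      # exact installed package release
  rpm -q --changelog corosync | head   # recent fixes in the installed build
  yum list updates corosync            # newer errata available in the repos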

On 6/17/14, 8:41 AM, "Christine Caulfield" <ccaulfie@xxxxxxxxxx> wrote:

>On 12/06/14 20:06, Digimer wrote:
>> Hrm, I'm not really sure that I am able to interpret this without making
>> guesses. I'm cc'ing one of the devs (who I hope will poke the right
>> person if he's not able to help at the moment). Let's see what he has to
>> say.
>>
>> I am curious now, too. :)
>>
>> On 12/06/14 03:02 PM, Schaefer, Micah wrote:
>>> Node4 was fenced again. I was able to get some debug logs (below) and a
>>> new message:
>>>
>>> "Jun 12 14:01:56 corosync [TOTEM ] The token was lost in the
>>>OPERATIONAL
>>> state.³
>>>
>>>
>>> Rest of corosync logs
>>>
>>> http://pastebin.com/iYFbkbhb
>>>
>>>
>>> Jun 12 14:44:49 corosync [TOTEM ] entering OPERATIONAL state.
>>> Jun 12 14:44:49 corosync [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> Jun 12 14:44:49 corosync [TOTEM ] waiting_trans_ack changed to 0
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 32947 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] entering GATHER state from 12.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 32947 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 32947 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33016 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33016 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33016 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33016 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33086 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33086 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33086 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33086 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33155 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33155 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33155 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:49 corosync [TOTEM ] Process pause detected for 33155 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:50 corosync [TOTEM ] Process pause detected for 33224 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:50 corosync [TOTEM ] Process pause detected for 33224 ms,
>>> flushing membership messages.
>>> Jun 12 14:44:50 corosync [TOTEM ] Process pause detected for 33225 ms,
>>> flushing membership messages.
>
>
>I'm concerned that the pause messages are repeating like that; it looks
>like it might be a bug that has already been fixed. What version of
>corosync do you have?
>
>Chrissie
>
>--
>Linux-cluster mailing list
>Linux-cluster@xxxxxxxxxx
>https://www.redhat.com/mailman/listinfo/linux-cluster
-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
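For context on the logs above: corosync prints "Process pause detected for N ms" when its own process was not scheduled for that long (heavy load, swapping, or a paused VM are typical causes), and any pause longer than the totem token timeout causes the other nodes to form a new membership without the node and fence it. A common mitigation, sketched here for a cman-based cluster with an illustrative value (it would not cover pauses as long as the ~33 s seen above), is to raise the token timeout in cluster.conf:

  <!-- token timeout in ms; the cman default is typically 10000 -->
  <totem token="30000"/>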
