Re: Totem Process pause detected

A process pause is a problem in the operating system.  Totem is not designed to handle long scheduling delays by the operating system, and the roughly 3464 seconds reported in your logs is a super long process pause.

My only recommendation is to shut down the cluster prior to a snapshot operation.  Since the system is out of service during that period anyway (because nothing is being scheduled by the operating system), there is no harm in doing so.

When I actively contributed to corosync development, I often ran totem with a token of 200 msec, and I recommend lower token timer values.  The fact that Hyper-V blocks all operating system functionality for close to an hour seems like a very serious problem, likely to blow up all kinds of fault detection timers in various software, not just Corosync.
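
A minimal sketch of that kind of tuning, reusing the 200 msec figure above (the value is illustrative, not a recommendation for your specific network; everything else in your existing totem section stays as it is):

totem {
    version: 2
    # token is expressed in milliseconds; 200 ms means a lost token is
    # noticed almost immediately, so real node failures are detected
    # quickly -- but any scheduling pause longer than this will also
    # trigger a membership change, so it only makes sense on hosts
    # that schedule the guest reliably
    token: 200
}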

Regards,
-steve

On Tue, Dec 22, 2015 at 8:32 AM, Fabio M. Di Nitto <fdinitto@xxxxxxxxxx> wrote:


On 12/21/2015 10:32 PM, Ludovic Zammit wrote:
> Hello,
>
> I'm running a CentOS 6.7 cluster of 2 nodes on a Hyper-V hypervisor.
> Every day at 11PM a snapshot job saves both servers.
> The snapshotting process seems to cause a loss of connectivity between
> the two nodes, which results in the cluster partitioning and Pacemaker
> starting services on both nodes.
> Then once the snapshotting is done, the two halves of the cluster are
> able to see each other again and Pacemaker chooses one on which to run
> the services.
> Unfortunately that means that our DRBD partition has been mounted on
> both, so it now goes into "split brain" mode.
>
>
> When I was running corosync 1.4, I used to adjust the "token" variable
> in the configuration file so that both nodes would wait longer before
> detecting a loss of the other.
>
> Now that I have upgraded to corosync 2 (2.3.5 to be more precise) the
> problem is back with a vengeance.
>
> I have tried the configuration below, with a very high token value,
> and that resulted in the following errors (I have since reverted that
> change):

It's a bad idea to increase the token timeout that high: it means that
any real fault detection between nodes will take forever.
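
To put a number on it: with the token of 150000 ms from the
configuration quoted below, a genuinely dead node would only be declared
lost after roughly 150000 ms / 1000 = 150 seconds (2.5 minutes), since
failover cannot begin before the token timeout expires (the exact figure
also involves the consensus timeout, but the token value dominates).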

>
> Dec 21 08:59:13 [16696] node1 corosync notice  [TOTEM ] totemsrp.c:783
> Process pause detected for 3464149 ms, flushing membership messages.
> Dec 21 08:59:13 [16696] node1 corosync notice  [TOTEM ] totemsrp.c:783
> Process pause detected for 3464149 ms, flushing membership messages.
> Dec 21 08:59:13 [16696] node1 corosync notice  [TOTEM ] totemsrp.c:783
> Process pause detected for 3464199 ms, flushing membership messages.
>
>
> What can I do to prevent the cluster splitting apart during those
> nightly snapshots?

Either use another backup method, or stop the cluster on the VM you are
about to snapshot, take the snapshot, start the cluster again, and then
move on to the next node.
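
A minimal sketch of that per-node sequence on the guest side, assuming
the stock CentOS 6 init scripts (the snapshot itself is taken from the
Hyper-V host in between):

# before the snapshot: take this node out of the cluster cleanly
service pacemaker stop
service corosync stop

# ...take the Hyper-V snapshot of this VM from the host...

# once the snapshot has completed: rejoin the cluster
service corosync start
service pacemaker start

Repeat for the other node, so at least one node is carrying the services
at any given time.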

> How do I manually set a long totem timeout without breaking everything else?
>

The problem is not just the token timeout: the VM was frozen for at
least 3464199 ms (almost an hour) without being scheduled by the
hypervisor. So even a very high token timeout would not solve the
problem of services running on that specific VM NOT being available
during the snapshot.

Fabio


>
>
>
> ======================================================================
>
> Software version:
> 2.6.32-573.7.1.el6.x86_64
>
> corosync-2.3.5-1.el6.x86_64
> corosynclib-2.3.5-1.el6.x86_64
>
> pacemaker-cluster-libs-1.1.13-1.el6.x86_64
> pacemaker-cli-1.1.13-1.el6.x86_64
>
> kmod-microsoft-hyper-v-4.0.11-20150728.x86_64
> microsoft-hyper-v-4.0.11-20150728.x86_64
>
> Configuration:
>
> totem {
>     version: 2
>
>     crypto_cipher: none
>     crypto_hash: none
>     clear_node_high_bit: yes
>     cluster_name: cluster
>     transport: udpu
>     token: 150000
>
>     interface {
>         ringnumber: 0
>         bindnetaddr: 10.200.0.2
>         mcastport: 5405
>         ttl: 1
>     }
> }
>
> nodelist {
>     node {
>         ring0_addr:  10.200.0.2
>     }
>
>     node {
>         ring0_addr:  10.200.0.3
>     }
> }
>
> logging {
>     fileline: on
>     to_stderr: no
>     to_logfile: yes
>     logfile: /var/log/cluster/corosync.log
>     to_syslog: yes
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: QUORUM
>         debug: off
>     }
> }
>
>
> quorum {
>     provider: corosync_votequorum
>     two_node: 1
> }
>
>
>
> Thank you for your help,
> —
>
> Ludovic Zammit

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
