Re: Question: Duration of DC election

Andrew Beekhof <andrew@xxxxxxxxxxx> · Thu, 27 Aug 2015 13:06:29 +1000

> On 25 Aug 2015, at 7:46 pm, Stefan Wenk <Stefan.wenk@xxxxxxx> wrote:
> 
> Hi,
> 
> I'm performing downtime measurement tests using corosync version 2.3.0 and pacemaker version 1.1.12 under RHEL 6.5 MRG and, although it is not recommended, I tuned the corosync configuration settings to the following insane values:
> 
>        # Timeout for token
>        token: 60
>        token_retransmits_before_loss_const: 1
> 
>        # How long to wait for join messages in the membership protocol (ms)
>        join: 35
>        consensus: 70
> 
> My two-node cluster consists of a kamailio clone resource, which replicates the so-called userlocation state using DMQ at the application level (see [1]). The switchover migrates an ocf:heartbeat:IPaddr2 resource. With these settings, the service downtime is below 100ms for a controlled cluster switchover, when "/etc/init.d/pacemaker stop" and "/etc/init.d/corosync stop" are executed.
> 
> The service downtime is about 400ms when power loss is simulated on the active node and that node is not the DC. When I simulate power loss on an active node that is also the DC, the service downtime increases to about 1500ms. As the timestamps in the logs have only one-second resolution, it is hard to provide more detailed numbers, but apparently the DC election procedure takes more than 1000ms.
> 
> Are there any ways to tune the DC election process? Is there documentation on what happens in this situation?
> 
> Tests with more nodes in the cluster showed that the service downtime increases with the number of online cluster nodes, even if the DC runs on one of the nodes that remain active.

When there are only 2 nodes, there is effectively no election happening and the delay is made up of:
- corosync detection time
- time for the crmd to send a message to itself via corosync
- time for the policy engine to figure out where to put the service
- time for the start action of your service(s) to execute

There _should_ be an entry for “time to fence the peer”, but based on your reported times I’m assuming you’ve turned that off.
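
To make the comparison concrete, here is a rough sketch in Python of the budget above applied to the reported figures. The function names and component keys are made up for illustration (they are not Pacemaker or corosync APIs), and only the 400ms and 1500ms totals come from the tests in this thread; the individual component times are not known.

```python
# Rough sketch only: names here are illustrative, not Pacemaker or corosync APIs.

# The delay components listed above, in the order they occur.
COMPONENTS = (
    "corosync_detection",
    "crmd_self_message",
    "policy_engine",
    "resource_start",
)


def total_delay_ms(parts):
    """Sum the per-component delays (ms); every component must be present."""
    missing = [name for name in COMPONENTS if name not in parts]
    if missing:
        raise ValueError("missing components: " + ", ".join(missing))
    return sum(parts[name] for name in COMPONENTS)


def unexplained_ms(measured_ms, parts):
    """Downtime (ms) not accounted for by the listed components."""
    return measured_ms - total_delay_ms(parts)


# Reported downtimes from the tests (ms).
non_dc_failure_ms = 400
dc_failure_ms = 1500

# Extra downtime seen when the failed node was also the DC.
extra_when_dc_fails_ms = dc_failure_ms - non_dc_failure_ms
print(extra_when_dc_fails_ms)  # 1100
```

Filling in `parts` with a measured value for each step would show which component the extra 1100ms or so belongs to.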

As the node count goes up, elections need to start happening for real (so you need to hear from everyone and have them all agree on a winner), but it should still be pretty quick.
The policy engine will take incrementally longer because it has more nodes to loop through, but that should be negligible on the scale that corosync can operate at.

I’d be interested to know what log messages you’re basing your timing numbers on.
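
Since the log timestamps only have one-second resolution, one option (a suggestion, not something taken from the tests so far) is to measure from outside: probe the service address at a fixed interval from a third machine, record a timestamp and whether each probe succeeded, and compute the longest gap. A minimal sketch of the gap calculation in Python follows; `longest_outage_ms` and the sample format are hypothetical, and the probe itself is left out.

```python
def longest_outage_ms(samples):
    """Return the longest outage in milliseconds.

    `samples` is a time-ordered list of (timestamp_seconds, ok) tuples.
    An outage is measured from the last successful probe before a run of
    failures to the first successful probe after it, so the result can
    overestimate the real outage by up to two probe intervals.
    Runs of failures that never recover are not counted.
    """
    longest = 0.0
    last_ok = None      # timestamp of the most recent successful probe
    in_outage = False   # True while inside a run of failed probes
    for ts, ok in samples:
        if ok:
            if in_outage and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            in_outage = False
        else:
            in_outage = True
    return round(longest * 1000.0, 3)


# Hypothetical probe results taken every 10ms.
samples = [(0.00, True), (0.01, True), (0.02, False), (0.03, False), (0.05, True)]
print(longest_outage_ms(samples))
```

With a probe interval well below the token timeout, this would give numbers that do not depend on the log resolution at all.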

> 
> I'm using one ring only. It looks as if using two rings does not change the test results much.
> 
> Thank you,
> 
> Stefan
> 
> [1] http://kamailio.org/docs/modules/devel/modules/dmq.html
> 
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
