Keepalived - spurious failovers




Hello,

We are using CentOS 6.6 and keepalived 1.2.13 on two servers for
failover only, with no load balancing. Failover is governed by the NIC
being present, and by the Apache and Tomcat processes being present.
Both servers are configured as 'EQUAL' (not master/backup). An initial
priority of 100 is set on each, and if a process or the NIC fails, that
server's priority is reduced by 60. The other server then sees the
lower priority and failover takes place. Generally this works well. If
we stop the network or one of the processes, this is logged (to
/var/log/messages) and failover happens within a few seconds.
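
To make the priority arithmetic above concrete, here is a minimal
Python sketch. It is an illustration only, not keepalived code: the
function names and check names are made up for the example, and the
clamping of the result to the range 1..254 is an assumption about how
the effective priority is bounded.

```python
BASE_PRIORITY = 100  # initial priority on both servers (from the post)
PENALTY = 60         # reduction applied per failed check (from the post)


def effective_priority(failed_checks, base=BASE_PRIORITY, penalty=PENALTY):
    """Return the priority a server would advertise given its failed checks.

    Assumption: every failed check subtracts `penalty`, and the result
    is clamped to the range 1..254.
    """
    prio = base - penalty * len(failed_checks)
    return max(1, min(254, prio))


def expected_master(failed_a, failed_b):
    """Return 'a', 'b' or 'tie' for whichever server advertises more."""
    prio_a = effective_priority(failed_a)
    prio_b = effective_priority(failed_b)
    if prio_a == prio_b:
        return "tie"
    return "a" if prio_a > prio_b else "b"
```

With no failed checks both servers advertise 100, so neither should be
seen as having the higher priority. A single failed check on one server
drops it to 40, which is what normally triggers our failover.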

However, we have had failovers occur during the night several times. It
happened last night, and the night before. Nothing was logged in the
messages file about the NIC being down, or the Apache/Tomcat processes
being unavailable. Nothing was logged by the Apache or Tomcat processes
in their own log files. The failovers have happened at 03:56 on both
nights.

The most obvious suspect would be some nighttime process such as log
rotation or automatic updates. However, I can see nothing obvious
occurring during the night that would cause the keepalived virtual
interface to fail over.

The messages log file typically shows:

On the previous master, now slave server...
===========================
Nov 12 03:56:40 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) Transition to MASTER STATE
Nov 12 03:56:43 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) Entering MASTER STATE
Nov 12 03:56:43 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) setting protocol VIPs.
Nov 12 03:56:43 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) Sending gratuitous ARPs on eth0 for xxx.xxx.xxx.xxx
Nov 12 03:56:48 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) Sending gratuitous ARPs on eth0 for xxx.xxx.xxx.xxx
Nov 12 03:56:51 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) Received higher prio advert
Nov 12 03:56:51 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) Entering BACKUP STATE
Nov 12 03:56:51 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) removing protocol VIPs.
==========================
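
For comparing the two servers, the state changes can be pulled out of
the messages file with a small script. The following is a rough Python
sketch that assumes each entry is on a single line in the log file (the
entries above are wrapped by the mail client) and that the line layout
matches the excerpt. The function name and regular expressions are my
own, not part of keepalived.

```python
import re

# Matches lines such as:
# Nov 12 03:56:43 bill Keepalived_vrrp[27279]: VRRP_Instance(Shib_srvrs) ...
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"Keepalived_vrrp\[\d+\]:\s+VRRP_Instance\((?P<instance>[^)]+)\)\s+"
    r"(?P<message>.*)$"
)
# Only the "Entering ... STATE" messages are treated as state changes.
STATE_RE = re.compile(r"Entering (\w+) STATE")


def vrrp_state_changes(lines):
    """Return (timestamp, host, instance, state) for each state change."""
    events = []
    for line in lines:
        entry = LINE_RE.match(line.strip())
        if entry is None:
            continue  # not a Keepalived_vrrp instance line
        state = STATE_RE.search(entry.group("message"))
        if state is not None:
            events.append((entry.group("stamp"), entry.group("host"),
                           entry.group("instance"), state.group(1)))
    return events
```

Running this over /var/log/messages on both servers, for example with
vrrp_state_changes(open("/var/log/messages")), gives a short list of
transitions per host that can be lined up by timestamp.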

On the previous slave, now master server, there is nothing logged at (or
around) this time at all.

As shown above, the previous master's log reports 'Received higher prio
advert'. That implies that the priority on this server had become lower
than the other server's, but there is no indication why.

Has anyone seen this themselves? Or have any idea why it may be
occurring? As I said, some nighttime process seems to be the cause, but
I cannot think of, or find, anything that would cause it.



Thanks,

John.

-- 
John Horne                   Tel: +44 (0)1752 587287
Plymouth University, UK

_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos



