Re: Spurious MON re-elections

Gregory Farnum <greg@xxxxxxxxxxx> · Wed, 1 Apr 2015 10:06:36 -0700

On Wed, Apr 1, 2015 at 5:03 AM, Sylvain Munaut
<s.munaut@xxxxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
>
> For some unknown reason, periodically, the master is kicked out and
> another one becomes leader. And then a couple of seconds later, the
> original master calls for re-election and becomes leader again.
>
> This also seems to cause some load even after the original master is
> back. Here are a couple of graphs from the monitor at one such event:
>
>   CPU load: http://i.imgur.com/7byRYhL.png
>   Memory:   http://i.imgur.com/4I0iE0l.png
>
> I raised the paxos debug to 5 and this is what happens on mon.a & mon.b:
>
>   The short version just around the event: http://pastebin.com/h3AhHhHb
>   The longer/full logs are available at http://ge.tt/2hMgZTD2
>
>
> Any explanation of what's happening and how to prevent it?

Notice the "lease timeout" note in mon.b? It's unhappy because the
leader didn't update it recently enough that mon.b can keep serving
reads, so mon.b called an election on the presumption that mon.a died.
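
To make the lease logic concrete, here is a toy model of the behaviour
described above. It is only a sketch, not the monitor's actual code: the
class name, the method names and the 5-second lease value are all
assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peon:
    """Toy model of a non-leader monitor holding a read lease (hypothetical)."""
    lease_duration: float      # seconds a lease stays valid (illustrative value)
    lease_expiry: float = 0.0  # time at which the current lease runs out

    def on_lease_renewal(self, now: float) -> None:
        # The leader extended the lease; reads may be served until expiry.
        self.lease_expiry = now + self.lease_duration

    def should_call_election(self, now: float) -> bool:
        # No renewal arrived in time, so presume the leader has died.
        return now > self.lease_expiry


peon = Peon(lease_duration=5.0)
peon.on_lease_renewal(now=0.0)
print(peon.should_call_election(now=3.0))   # False: lease still valid
print(peon.should_call_election(now=40.0))  # True: leader silent too long
```

The point is that the peon never needs to know why the leader went quiet;
a stalled leader and a dead one look the same from its side.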

And indeed there's nothing in the log for mon.a between 17:49:32.77602
and 17:50:10.929258, which seems not great. I'd look and see if
something is happening with your disks, maybe?
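
If you want to hunt for silences like this across the full log, a small
script can flag them. This is a rough sketch that assumes each log line
starts with a "YYYY-MM-DD HH:MM:SS.ffffff" timestamp; the function name,
the 10-second threshold and the sample lines are made up for the example.

```python
from datetime import datetime, timedelta


def find_gaps(lines, threshold=timedelta(seconds=10)):
    """Yield (previous, current) timestamps separated by more than threshold."""
    previous = None
    for line in lines:
        try:
            # Assumed layout: timestamp occupies the first 26 characters.
            current = datetime.strptime(line[:26], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if previous is not None and current - previous > threshold:
            yield previous, current
        previous = current


# Made-up sample lines, not taken from the real logs.
sample = [
    "2015-01-01 12:00:00.000000 hypothetical log line",
    "2015-01-01 12:00:38.000000 hypothetical log line",
    "2015-01-01 12:00:38.500000 hypothetical log line",
]
gaps = list(find_gaps(sample))
for start, end in gaps:
    print(f"silent for {(end - start).total_seconds():.1f}s starting at {start}")
```

To run it on a real file, pass the open file object instead of the sample
list, e.g. find_gaps(open(path)) with path pointing at the mon log.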

Based on your graphs, actually, the CPU load you're seeing is probably
the cause, not the effect. An election can increase load some if a
bunch of client messages get piled up and need to be processed, but
otherwise it's just a couple messages and a hiccup in processing...
-Greg

>
>
> I can post more info on request. I'm also available on IRC ( nick
> 'tnt' ) for live debug if needed :p
>
>
> Cheers,
>
>     Sylvain Munaut
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com