Re: Monitors not reaching quorum

"Sergio A. de Carvalho Jr." <scarvalhojr@xxxxxxxxx> · Mon, 25 Jul 2016 16:17:27 +0100

We're having problems starting the 5th host (possibly a BIOS problem), so I won't be able to recover its monitor any time soon.
I knew having an even number of monitors wasn't ideal, and that's why I started 3 monitors first and waited until they reached quorum before starting the 4th monitor. I was hoping that once quorum was established, the 4th monitor would simply join the other 3, instead of calling for new elections. I didn't think having an odd number of monitors was a hard requirement.
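
For reference, my understanding is that a quorum needs a strict majority of the monitors listed in the monmap, whether or not they are running. Below is a minimal sketch of that arithmetic; the function names are mine and purely illustrative, not part of any Ceph API.

```python
def majority(monmap_size: int) -> int:
    """Smallest number of monitors that is a strict majority of the monmap."""
    return monmap_size // 2 + 1


def can_form_quorum(monmap_size: int, reachable: int) -> bool:
    """True if `reachable` monitors are enough for a quorum."""
    return reachable >= majority(monmap_size)


# Our monmap lists 5 monitors, one of which is powered off.
for alive in range(1, 6):
    print(alive, "alive ->", can_form_quorum(5, alive))
```

By this count, both 3 and 4 live monitors out of 5 clear the threshold of 3. An even number of live monitors tolerates no more failures than the odd number below it, but it should not by itself prevent a quorum.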

I'm wondering if having one dead monitor in the map is complicating the election.
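
If removing it turns out to be the way to go, the documented procedure for an unhealthy cluster is, as far as I can tell, to extract the monmap from a stopped surviving monitor, remove the dead entry with monmaptool, and inject the result back. Here is a rough sketch that only prints the commands rather than running them. The /tmp/monmap path and the helper name are my own assumptions, and I have not tried this on our cluster yet.

```python
import shlex


def monmap_removal_commands(surviving_mon_id, dead_mon_name,
                            monmap_path="/tmp/monmap"):
    """Build (but do not run) the commands to drop one monitor from the monmap.

    The surviving monitor daemon must be stopped before extract/inject.
    monmap_path is an arbitrary scratch location chosen for illustration.
    """
    return [
        ["ceph-mon", "-i", surviving_mon_id, "--extract-monmap", monmap_path],
        ["monmaptool", monmap_path, "--rm", dead_mon_name],
        ["ceph-mon", "-i", surviving_mon_id, "--inject-monmap", monmap_path],
    ]


# 618yl02 is the powered-off host; 60z0m02 is one of the survivors.
for cmd in monmap_removal_commands("60z0m02", "618yl02"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The inject step would need repeating on each surviving monitor before restarting them, so that they all agree on the new map.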

On Mon, Jul 25, 2016 at 3:45 PM, Joshua M. Boniface <joshua@xxxxxxxxxxx> wrote:
My understanding is that you need an odd number of monitors to reach quorum. This seems to match what you're seeing: with 3, there is a definite leader, but with 4, there isn't. Have you tried starting both the 4th and 5th simultaneously and letting them both vote?

--

Joshua M. Boniface

Linux System Ærchitect

Sigmentation fault. Core dumped.

On 25/07/16 10:41 AM, Sergio A. de Carvalho Jr. wrote:

> In the logs, 2 monitors are constantly reporting that they won the leader election:

>

> 60z0m02 (monitor 0):

> 2016-07-25 14:31:11.644335 7f8760af7700  0 log_channel(cluster) log [INF] : mon.60z0m02@0 won leader election with quorum 0,2,4

> 2016-07-25 14:31:44.521552 7f8760af7700  1 mon.60z0m02@0(leader).paxos(paxos recovering c 1318755..1319320) collect timeout, calling fresh election

>

> 60zxl02 (monitor 1):

> 2016-07-25 14:31:59.542346 7fefdeaed700  1 mon.60zxl02@1(electing).elector(11441) init, last seen epoch 11441

> 2016-07-25 14:32:04.583929 7fefdf4ee700  0 log_channel(cluster) log [INF] : mon.60zxl02@1 won leader election with quorum 1,2,4

> 2016-07-25 14:32:33.440103 7fefdf4ee700  1 mon.60zxl02@1(leader).paxos(paxos recovering c 1318755..1319319) collect timeout, calling fresh election
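
The two "won leader election" lines above can be compared mechanically. The following is a small sketch, not a Ceph tool: the regular expression and helper names are assumptions written for this log format. It extracts each winner's quorum and reports pairs of winners that are absent from each other's quorum.

```python
import re

# The two "won leader election" lines quoted above.
LOG_LINES = [
    "2016-07-25 14:31:11.644335 7f8760af7700  0 log_channel(cluster) log [INF] : "
    "mon.60z0m02@0 won leader election with quorum 0,2,4",
    "2016-07-25 14:32:04.583929 7fefdf4ee700  0 log_channel(cluster) log [INF] : "
    "mon.60zxl02@1 won leader election with quorum 1,2,4",
]

WON = re.compile(
    r"mon\.(?P<name>\S+)@(?P<rank>\d+) won leader election "
    r"with quorum (?P<quorum>[\d,]+)"
)


def election_wins(lines):
    """Return (name, rank, quorum_set) for every 'won leader election' line."""
    wins = []
    for line in lines:
        match = WON.search(line)
        if match:
            quorum = {int(r) for r in match.group("quorum").split(",")}
            wins.append((match.group("name"), int(match.group("rank")), quorum))
    return wins


def mutually_excluded(wins):
    """Pairs of winners where neither rank appears in the other's quorum."""
    pairs = []
    for i, (_, rank_a, quorum_a) in enumerate(wins):
        for _, rank_b, quorum_b in wins[i + 1:]:
            if rank_a not in quorum_b and rank_b not in quorum_a:
                pairs.append((rank_a, rank_b))
    return pairs


wins = election_wins(LOG_LINES)
print(mutually_excluded(wins))  # -> [(0, 1)]
```

In these excerpts monitor 0 wins with quorum 0,2,4 and monitor 1 wins with quorum 1,2,4, so each leaves the other out. That might point at a connectivity problem between those two hosts, although the logs alone do not prove it.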

>

>

> On Mon, Jul 25, 2016 at 3:27 PM, Sergio A. de Carvalho Jr. <scarvalhojr@xxxxxxxxx> wrote:

>

>     Hi,

>

>     I have a cluster of 5 hosts running Ceph 0.94.6 on CentOS 6.5. On each host, there is 1 monitor and 13 OSDs. We had an issue with the network and, for some reason I still don't know, the servers were restarted. One host is still down, but the monitors on the 4 remaining servers are failing to enter a quorum.

>

>     I managed to get a quorum of 3 monitors by stopping all Ceph monitors and OSDs across all machines, and bringing up the top 3 ranked monitors in order of rank. After a few minutes, the 60z0m02 monitor (the top ranked one) became the leader:

>

>     {
>         "name": "60z0m02",
>         "rank": 0,
>         "state": "leader",
>         "election_epoch": 11328,
>         "quorum": [
>             0,
>             1,
>             2
>         ],
>         "outside_quorum": [],
>         "extra_probe_peers": [],
>         "sync_provider": [],
>         "monmap": {
>             "epoch": 5,
>             "fsid": "2f51a247-3155-4bcf-9aee-c6f6b2c5e2af",
>             "modified": "2016-04-28 22:26:48.604393",
>             "created": "0.000000",
>             "mons": [
>                 {
>                     "rank": 0,
>                     "name": "60z0m02",
>                     "addr": "10.98.2.166:6789\/0"
>                 },
>                 {
>                     "rank": 1,
>                     "name": "60zxl02",
>                     "addr": "10.98.2.167:6789\/0"
>                 },
>                 {
>                     "rank": 2,
>                     "name": "610wl02",
>                     "addr": "10.98.2.173:6789\/0"
>                 },
>                 {
>                     "rank": 3,
>                     "name": "618yl02",
>                     "addr": "10.98.2.214:6789\/0"
>                 },
>                 {
>                     "rank": 4,
>                     "name": "615yl02",
>                     "addr": "10.98.2.216:6789\/0"
>                 }
>             ]
>         }
>     }

>

>     The other 2 monitors became peons:

>

>     "name": "60zxl02",
>         "rank": 1,
>         "state": "peon",
>         "election_epoch": 11328,
>         "quorum": [
>             0,
>             1,
>             2
>         ],
>
>     "name": "610wl02",
>         "rank": 2,
>         "state": "peon",
>         "election_epoch": 11328,
>         "quorum": [
>             0,
>             1,
>             2
>         ],

>

>     I then proceeded to start the fourth monitor, 615yl02 (618yl02 is powered off), but after more than 2 hours and several election rounds, the monitors still haven't reached a quorum. The monitors alternate mostly between the "electing" and "probing" states, but they often seem to be in different election epochs.
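
To check whether the monitors agree on an election epoch, the mon_status output from each admin socket can be lined up side by side. The sketch below assumes each monitor's output has already been collected as JSON text. The samples are trimmed from the three outputs shown above, and the helper name is hypothetical.

```python
import json


def summarize(statuses):
    """Return sorted (rank, name, state, epoch) rows and whether epochs agree."""
    rows = sorted(
        (s["rank"], s["name"], s["state"], s["election_epoch"]) for s in statuses
    )
    epochs = {row[3] for row in rows}
    return rows, len(epochs) == 1


# Trimmed from the three mon_status outputs quoted above.
samples = [
    json.loads(text)
    for text in (
        '{"name": "60z0m02", "rank": 0, "state": "leader", "election_epoch": 11328}',
        '{"name": "60zxl02", "rank": 1, "state": "peon", "election_epoch": 11328}',
        '{"name": "610wl02", "rank": 2, "state": "peon", "election_epoch": 11328}',
    )
]

rows, epochs_agree = summarize(samples)
for rank, name, state, epoch in rows:
    print(rank, name, state, epoch)
print("same election epoch:", epochs_agree)
```

With the three-monitor quorum all epochs match; running the same comparison after the fourth monitor joins would show which monitors are lagging or racing ahead.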

>

>     Is this normal?

>

>     Is there anything I can do to help the monitors elect a leader? Should I manually remove the dead host's monitor from the monitor map?

>

>     I deliberately left all OSD daemons stopped while the election is going on. Is this the best thing to do? Would bringing the OSDs up help or complicate matters even more? Or doesn't it make any difference?

>

>     I don't see anything obviously wrong in the monitor logs. They're mostly filled with messages like the following:

>

>     2016-07-25 14:17:57.806148 7fc1b3f7e700  1 mon.610wl02@2(electing).elector(11411) init, last seen epoch 11411

>     2016-07-25 14:17:57.829198 7fc1b7caf700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch

>     2016-07-25 14:17:57.829200 7fc1b7caf700  0 log_channel(audit) do_log log to syslog

>     2016-07-25 14:17:57.829254 7fc1b7caf700  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished

>

>     Any help would be hugely appreciated.

>

>     Thanks,

>

>     Sergio

>

>

>

>

> _______________________________________________

> ceph-users mailing list

> ceph-users@xxxxxxxxxxxxxx

> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
