After spending some hours debugging packets on the wire, without seeing a good reason for things not to work, the monitor on server2 eventually joined the quorum. We were happy for a while, until our alerting sent a message that the quorum had been lost again. And indeed, the monitor on server2 had died, and now comes the not-so-funny part: restarting the monitor makes the cluster hang again.

I will post another debug log in the next hours, this time from the monitor on server2.

Nico Schottelius <nico.schottelius@xxxxxxxxxxx> writes:

> Not sure if I mentioned it before: adding a new monitor also puts the whole
> cluster into a stuck state.
>
> Some minutes ago I did:
>
> root@server1:~# ceph mon add server2 2a0a:e5c0::92e2:baff:fe4e:6614
> port defaulted to 6789; adding mon.server2 at [2a0a:e5c0::92e2:baff:fe4e:6614]:6789/0
>
> And then started the daemon on server2:
>
> ceph-mon -i server2 --pid-file /var/lib/ceph/run/mon.server2.pid -c /etc/ceph/ceph.conf --cluster ceph --setuser ceph --setgroup ceph -d 2>&1 | tee ~/cephmonlog-2017-10-08-2
>
> And now the cluster hangs (as in: ceph -s does not return).
>
> Looking at the mon_status of server5 shows that server5 thinks it is time
> for an election [0].
>
> When stopping the monitor on server2 and trying to remove server2 again,
> the removal command also gets stuck and never returns:
>
> root@server1:~# ceph mon rm server2
>
> As our cluster is now severely degraded, I was wondering if anyone has a
> quick hint on how to get ceph -s working again and/or remove server2
> and/or re-add server1?
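A note for anyone hitting the same symptom: even when `ceph -s` hangs, each monitor still answers locally through its admin socket, which is how the mon_status output below was gathered. A minimal sketch for checking every monitor's election state (host names are the ones from this thread; the loop only prints the per-host commands so they can be copied to each box, and `jq` is assumed to be available there):

```shell
# Print, for each monitor host, the admin-socket command that reports
# its current state ("leader", "peon", "electing", "synchronizing", ...).
for m in server5 server3 server2 server1; do
  echo "ssh ${m} ceph daemon mon.${m} mon_status | jq -r .state"
done
```

Running these on the hosts themselves shows which monitors are stuck electing versus synchronizing without needing a working quorum.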
>
> Best,
>
> Nico
>
>
> [0]
>
> [10:50:38] server5:~# ceph daemon mon.server5 mon_status
> {
>     "name": "server5",
>     "rank": 0,
>     "state": "electing",
>     "election_epoch": 6087,
>     "quorum": [],
>     "features": {
>         "required_con": "153140804152475648",
>         "required_mon": [
>             "kraken",
>             "luminous"
>         ],
>         "quorum_con": "2305244844532236283",
>         "quorum_mon": [
>             "kraken",
>             "luminous"
>         ]
>     },
>     "outside_quorum": [],
>     "extra_probe_peers": [
>         "[2a0a:e5c0::92e2:baff:fe4e:6614]:6789/0"
>     ],
>     "sync_provider": [],
>     "monmap": {
>         "epoch": 11,
>         "fsid": "26c0c5a8-d7ce-49ac-b5a7-bfd9d0ba81ab",
>         "modified": "2017-10-08 10:43:49.667986",
>         "created": "2017-05-16 22:33:04.500528",
>         "features": {
>             "persistent": [
>                 "kraken",
>                 "luminous"
>             ],
>             "optional": []
>         },
>         "mons": [
>             {
>                 "rank": 0,
>                 "name": "server5",
>                 "addr": "[2a0a:e5c0::21b:21ff:fe85:a3a2]:6789/0",
>                 "public_addr": "[2a0a:e5c0::21b:21ff:fe85:a3a2]:6789/0"
>             },
>             {
>                 "rank": 1,
>                 "name": "server3",
>                 "addr": "[2a0a:e5c0::21b:21ff:fe85:a42a]:6789/0",
>                 "public_addr": "[2a0a:e5c0::21b:21ff:fe85:a42a]:6789/0"
>             },
>             {
>                 "rank": 2,
>                 "name": "server2",
>                 "addr": "[2a0a:e5c0::92e2:baff:fe4e:6614]:6789/0",
>                 "public_addr": "[2a0a:e5c0::92e2:baff:fe4e:6614]:6789/0"
>             },
>             {
>                 "rank": 3,
>                 "name": "server1",
>                 "addr": "[2a0a:e5c0::92e2:baff:fe8a:2e78]:6789/0",
>                 "public_addr": "[2a0a:e5c0::92e2:baff:fe8a:2e78]:6789/0"
>             }
>         ]
>     },
>     "feature_map": {
>         "mon": {
>             "group": {
>                 "features": "0x1ffddff8eea4fffb",
>                 "release": "luminous",
>                 "num": 1
>             }
>         },
>         "client": {
>             "group": {
>                 "features": "0x1ffddff8eea4fffb",
>                 "release": "luminous",
>                 "num": 4
>             }
>         }
>     }
> }
>
>
> Nico Schottelius <nico.schottelius@xxxxxxxxxxx> writes:
>
>> Good evening Joao,
>>
>> we double-checked our MTUs; they are all 9200 on the servers and 9212 on
>> the switches. And we have no problems transferring big files in general
>> (as OpenNebula copies around images for importing, we do this quite a
>> lot).
>>
>> So if you could have a look, it would be much appreciated.
>>
>> If we should collect other logs, just let us know.
>>
>> Best,
>>
>> Nico
>>
>> Joao Eduardo Luis <joao@xxxxxxx> writes:
>>
>>> On 10/04/2017 09:19 PM, Gregory Farnum wrote:
>>>> Oh, hmm, you're right. I see synchronization start, but it seems to
>>>> progress very slowly, and it certainly doesn't complete in that 2.5-minute
>>>> logging window. I don't see any clear reason why it's so slow; it might
>>>> become clearer if you could provide logs from the other monitors covering
>>>> the same period (especially since you now say they are getting stuck in
>>>> the electing state during that period). Perhaps Kefu or Joao will have
>>>> a clearer idea what the problem is.
>>>> -Greg
>>>
>>> I haven't gone through the logs yet (maybe Friday; it's late today and
>>> tomorrow is a holiday), but not so long ago I seem to recall someone
>>> having a similar issue with the monitors that was solely related to a
>>> switch's MTU being too small.
>>>
>>> Maybe that could be the case here? If not, I'll take a look at the logs
>>> as soon as possible.
>>>
>>> -Joao
>>>
>>>>
>>>> On Wed, Oct 4, 2017 at 1:04 PM Nico Schottelius
>>>> <nico.schottelius@xxxxxxxxxxx> wrote:
>>>>
>>>> Some more detail:
>>>>
>>>> when restarting the monitor on server1, it stays in the synchronizing
>>>> state forever.
>>>>
>>>> However, the other two monitors change into the electing state.
>>>>
>>>> I have double-checked that there are no (host) firewalls active and
>>>> that the clocks of the hosts differ by less than one second (they all
>>>> have ntpd running).
>>>>
>>>> We are running everything on IPv6, but this should not be a problem,
>>>> should it?
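Regarding the MTU theory: one way to verify the path MTU between the monitor hosts is an ICMPv6 echo with fragmentation disallowed, sized to fill the frame. With an interface MTU of 9200, the fixed IPv6 header (40 bytes) and the ICMPv6 echo header (8 bytes) leave a 9152-byte payload. A sketch (the target address is server2's from this thread; the `ping` invocation is printed rather than executed, and the `-6 -M do` flags assume a recent iputils ping; older systems spell it `ping6 -M do`):

```shell
# Largest unfragmented ICMPv6 echo payload for a 9200-byte interface MTU.
MTU=9200          # interface MTU reported on the servers
IPV6_HDR=40       # fixed IPv6 header size
ICMP6_HDR=8       # ICMPv6 echo header size
PAYLOAD=$((MTU - IPV6_HDR - ICMP6_HDR))
echo "payload: ${PAYLOAD} bytes"

# Run this on one monitor host; -M do forbids fragmentation, so replies
# only come back if the whole path really carries 9200-byte frames.
echo "ping -6 -c 3 -M do -s ${PAYLOAD} 2a0a:e5c0::92e2:baff:fe4e:6614"
```

If the full-size probe is dropped while a small one succeeds, a too-small MTU somewhere on the path (e.g. on a switch port) is the likely culprit.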
>>>>
>>>> Best,
>>>>
>>>> Nico
>>>>
>>>>
>>>> Nico Schottelius <nico.schottelius@xxxxxxxxxxx> writes:
>>>>
>>>> > Hello Gregory,
>>>> >
>>>> > the logfile I produced already has debug mon = 20 set:
>>>> >
>>>> > [21:03:51] server1:~# grep "debug mon" /etc/ceph/ceph.conf
>>>> > debug mon = 20
>>>> >
>>>> > It is clear that server1 is out of quorum, but how do we make it
>>>> > part of the quorum again?
>>>> >
>>>> > I expected the quorum-finding process to be triggered automatically
>>>> > after restarting the monitor; is that incorrect?
>>>> >
>>>> > Best,
>>>> >
>>>> > Nico
>>>> >
>>>> >
>>>> > Gregory Farnum <gfarnum@xxxxxxxxxx> writes:
>>>> >
>>>> >> You'll need to change the config so that it's running "debug mon = 20"
>>>> >> for the log to be very useful here. It does say that it's dropping
>>>> >> client connections because it's been out of quorum for too long, which
>>>> >> is the correct behavior in general. I'd imagine that you've got clients
>>>> >> trying to connect to the new monitor instead of the ones already in the
>>>> >> quorum and not failing over correctly; this is all configurable.
>>>> >>
>>>> >> On Wed, Oct 4, 2017 at 4:09 AM Nico Schottelius <
>>>> >> nico.schottelius@xxxxxxxxxxx> wrote:
>>>> >>
>>>> >>>
>>>> >>> Good morning,
>>>> >>>
>>>> >>> we recently upgraded our kraken cluster to luminous and have since
>>>> >>> noticed an odd behaviour: we cannot add a monitor anymore.
>>>> >>>
>>>> >>> As soon as we start a new monitor (server2), ceph -s and ceph -w
>>>> >>> start to hang.
>>>> >>>
>>>> >>> The situation became worse after one of our staff stopped an existing
>>>> >>> monitor (server1), as restarting that monitor results in the same
>>>> >>> situation: ceph -s hangs until we stop the monitor again.
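In case it helps anyone debugging a similar election problem: the debug level can also be raised at runtime through the monitor's admin socket, which works even while the monitor is out of quorum and so avoids another restart. A sketch (the commands are printed rather than executed; `debug_paxos` alongside `debug_mon` is my suggestion for election issues, not something from this thread):

```shell
# Raise monitor debug output at runtime via the local admin socket.
# "20/20" means log level 20 and in-memory gather level 20.
echo "ceph daemon mon.server1 config set debug_mon 20/20"
echo "ceph daemon mon.server1 config set debug_paxos 20/20"
```

Values set this way are not persisted; they revert on the next daemon restart unless also placed in ceph.conf.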
>>>> >>>
>>>> >>> We kept the monitor running for some minutes, but the situation
>>>> >>> never clears up.
>>>> >>>
>>>> >>> The network does not have any firewall between the nodes, and there
>>>> >>> are no host firewalls.
>>>> >>>
>>>> >>> I have attached the output of the monitor on server1, running in the
>>>> >>> foreground using
>>>> >>>
>>>> >>> root@server1:~# ceph-mon -i server1 --pid-file
>>>> >>> /var/lib/ceph/run/mon.server1.pid -c /etc/ceph/ceph.conf --cluster ceph
>>>> >>> --setuser ceph --setgroup ceph -d 2>&1 | tee cephmonlog
>>>> >>>
>>>> >>> Does anyone see any obvious problem in the attached log?
>>>> >>>
>>>> >>> Any input or hint would be appreciated!
>>>> >>>
>>>> >>> Best,
>>>> >>>
>>>> >>> Nico

--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com