Hello list,

I’m having a serious issue: my Ceph cluster has become unresponsive. I was upgrading my cluster (3 servers, 3 monitors) from 13.2.1 to 13.2.2, which shouldn’t be a problem. However, on reboot my first host reported:

starting mon.ceph01 rank -1 at 192.168.200.197:6789/0 mon_data /var/lib/ceph/mon/ceph-ceph01 fsid 27dd45f1-28b5-4ac6-81ab-c62bc581130c
mon.cephxx@-1(probing) e5 preinit fsid 27dd45f1-28b5-4ac6-81ab-c62bc581130c
mon.cephxx@-1(probing) e5 not in monmap and have been in a quorum before; must have been removed
-1 mon.cephxx@-1(probing) e5 commit suicide!
-1 failed to initialize

I thought that perhaps the monitor didn’t want to accept the monmap of the other two because of the version difference. Sadly, I upgraded and rebooted the second server.
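In case it helps with diagnosing this: would something like the following be a sane way to check whether ceph01 is actually still listed in the monmap that the other hosts have? (The host and paths below are just examples; I haven't touched the monitor stores beyond reading, and I'm not sure this is the right approach.)

    # on one of the other hosts, with its mon daemon stopped
    systemctl stop ceph-mon@ceph02
    ceph-mon -i ceph02 --extract-monmap /tmp/monmap
    monmaptool --print /tmp/monmap     # does mon.ceph01 still show up here?
    systemctl start ceph-mon@ceph02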
Now the cluster is unresponsive, because more than half of the monitors are offline / out of quorum. The log of my second host keeps spamming:

2018-10-04 14:39:06.802 7fed0058f700 -1 mon.ceph02@1(probing) e6 get_health_metrics reporting 14 slow ops, oldest is auth(proto 0 27 bytes epoch 6)

Any help VERY MUCH appreciated, this sucks.

Thanks
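P.S. Since the normal ceph commands just hang without quorum, would querying the mon daemons directly over the admin socket be the right way to see what state they think they are in? Something like (the socket path is just the default on my install, it might differ):

    ceph daemon mon.ceph02 mon_status
    # or equivalently:
    ceph --admin-daemon /var/run/ceph/ceph-mon.ceph02.asok mon_status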