On Thu, May 31, 2018 at 1:49 PM Leônidas Villeneuve <leonidas@xxxxxxxxxxxxx> wrote:
I had a small Ceph cluster and had to take down one node. The data from its OSDs was reallocated to the other OSDs, and that went fine.

After the reallocation, I removed its mon.service as described in the official documentation.

Then everything went wrong. The other mons collapsed and stopped talking to the mgrs. The mgr dashboard still works but shows outdated data. The OSDs are still up and the RBD volumes are working too, but the mons can't get online.

I tried everything described in the troubleshooting guide, including removing the old mon from the monmap, but I couldn't inject the new monmap because of lock errors in store.db. When I finally injected the new monmap, the mon refused to come up. I tried the same thing on the other mons and got the same results. And, to my despair, the store.db ended up corrupted.

I finally gave up and (after backing up the store.db) deleted the mons and started fresh ones. That worked, but the new mons now have no OSDs or hosts mapped to them. I have an old CRUSH map and that's all.

But since the OSDs are still up, is it possible to rebuild the map and all the data the mons need to start working again from them? That's the last resort I have.

To put it another way, I have OSD services and OSD data but no monitor and no mgr, and I need to get them running again. Any tips will be appreciated.

Thanks.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com