[URGENT] Rebuilding cluster data from remaining OSDs

I have a small Ceph cluster and had to take down one node. The data from its OSDs was rebalanced onto the remaining OSDs without problems.

After the rebalancing finished, I removed the node's mon service as described in the official documentation.

Then everything went wrong. The other mons collapsed and stopped talking to the mgrs. The mgr dashboard still works but shows outdated data. The OSDs are still up and the rbd volumes are still working, but the mons can't come online.

I tried everything in the troubleshooting guide, including removing the old mon from the monmap, but at first I couldn't inject the new monmap because of lock errors in store.db. When I finally managed to inject it, the mon refused to start. I tried the same procedure on the other mons and got the same result. And, to my despair, the store.db ended up corrupted.

I finally gave up and, after backing up the store.db, deleted the mons and created fresh ones. That worked, but the new mons have no OSDs or hosts mapped to them. All I have is an old crush map.

Since the OSDs are still up, is it possible to rebuild the maps and all the other data the mons need to start working again from them? That's my last resort.

To put it another way: I have the OSD services and the OSD data, but no working monitor and no mgr, and I need to get them running again. Any tips will be appreciated.

Thanks.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
