Hi, I have a ceph-mon problem. My cluster stopped working after I had to restart the machines it is installed on. My setup includes 3 hosts, one of which is currently down, but the cluster remained healthy
since the other two could form a quorum just fine. Now I have restarted the hosts because of a planned power outage, and the monitor on my 2nd node won't come up again. As a result, the OSD daemons won't start either (naturally).
My ceph-mon log reads as follows:

    2016-09-06 11:15:30.237519 7f8aeb1eb880 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-lanai': (22) Invalid argument
    2016-09-06 11:22:54.777162 7f9f7411e880 0 ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299), process ceph-mon, pid 11486

Additionally, I get a 'ceph-mon[2011]: Corruption: checksum mismatch' message when trying to start it manually. However, the monitor's directory is quite accessible:

    [root@lanai ceph-lanai]# ll /var/lib/ceph/mon/ceph-lanai/
    insgesamt 8
    -rw-r--r-- 1 ceph ceph    0 23. Feb 2016  done
    -rw-r--r-- 1 ceph ceph   77 23. Feb 2016  keyring
    drwxr-xr-x 2 ceph ceph 4096  6. Sep 11:15 store.db
    -rw-r--r-- 1 ceph ceph    0 23. Feb 2016  systemd

So is it some kind of file corruption? I have already read about recreating the monitor, but I am unsure whether that's possible, because I don't have a quorum at the moment.
I am running Ceph 9.2.0 on CentOS 7a and installed it via the Quick Start Guide. Any help would be greatly appreciated!

Thanks!
Christian