Hi,
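You mention early on that one mon is a spare, but later say that you have 4 mons running. Ceph monitors need a strict majority to establish a quorum, so an odd number is recommended: with 4 mons you need 3 up, which tolerates only one failure, the same as with 3 mons.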
I think you should run 3 or 5 mons, not 3 and a spare.
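
To make the arithmetic concrete, here is a minimal sketch of the majority rule. This is plain Python, not Ceph code, and the function names are made up for the illustration; it only assumes that quorum means a strict majority of the monitors in the monmap.

```python
def quorum_size(num_mons: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if num_mons < 1:
        raise ValueError("need at least one monitor")
    return num_mons // 2 + 1


def tolerated_failures(num_mons: int) -> int:
    """How many monitors can be down while a majority is still up."""
    return num_mons - quorum_size(num_mons)


if __name__ == "__main__":
    # Compare the monitor counts discussed in this thread.
    for n in (3, 4, 5):
        print(f"{n} mons: quorum needs {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} down")
```

Going from 3 to 4 mons raises the quorum size without raising the number of failures you can survive, which is why the fourth mon buys you nothing.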
Good luck
On 14/05/2013 12:58 PM, "Mr. NPP" <mr.npp@xxxxxxxxxxxxxxxxxxx> wrote:
Hello, I'm currently running 0.61 with about 44 OSDs and 4 monitors (one as a spare) across about 6 hosts.

I've been running into an issue where, when one Ceph host goes down, the entire system becomes unusable. Today we recovered from an SSD crash that took out an OSD's journal, and it was a lot of work to get it back up: I couldn't get the monitors to come up and establish quorum. I was going to rebuild it manually, but the Ceph documentation on manually (dirty) removing a monitor with the monmap tool is outdated, and I couldn't find the /mon-$id/monmap directory.

Anyway, I recovered eventually and was able to run with 4 monitors. I then updated the crushmap, and that crashed the monitor I was sending the updated crushmap to. It now gives me

[976]: (33) Numerical argument out of domain

when I try to start it manually. I've seen this assert failure before; I'm just not sure what's causing it. Below is the log from the crash.

I'm not even really sure my configs are right; I'm still pretty new at this. Below are the configs and the last map:

ceph.conf
crush.map.txt

If you need additional dumps from the monitor, I can get them.

Thanks,
mr.npp
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com