Hello,
I just deployed a new Emperor cluster using ceph-deploy 1.4. Everything went very smoothly until I rebooted all the nodes. After the reboot, the monitors no longer form a quorum.
I followed the troubleshooting steps here: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/
Specifically, I'm in the state described in this section: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/#most-common-monitor-issues
The state of all the monitors is "electing". The docs say this is most likely caused by clock skew, but all nodes are synced with NTP, and I've confirmed this multiple times. I've also confirmed the monitors can reach each other (by telnetting to IP:PORT, and I can see established connections via netstat).
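For anyone who wants to repeat the reachability check without telnet, it can be sketched as a small script. This is only an illustration of the manual check described above, not something taken from the Ceph tools: the helper names `can_connect` and `check_monitors` are hypothetical, and the addresses are simply the ones listed in the monmap in this message. It only tests that a TCP connection can be opened, so it says nothing about clock skew or the election itself.

```python
import socket

# Monitor addresses as listed in the monmap in this message (port 6789).
MONITORS = [
    ("ceph0", "10.10.30.0", 6789),
    ("ceph1", "10.10.30.1", 6789),
    ("ceph2", "10.10.30.2", 6789),
]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_monitors(monitors, timeout=3.0):
    """Return a dict mapping monitor name to reachability (True/False)."""
    return {name: can_connect(host, port, timeout) for name, host, port in monitors}
```

Running `check_monitors(MONITORS)` on each node in turn should give `True` for every monitor if the network path is fine, which matches what I saw by hand.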
I'm baffled.
Here is a sample quorum_status output:
root@ceph0:~# ceph daemon mon.ceph0 quorum_status
{ "election_epoch": 31,
"quorum": [],
"quorum_names": [],
"quorum_leader_name": "",
"monmap": { "epoch": 2,
"fsid": "XXX", (redacted)
"modified": "2014-03-24 14:35:22.332646",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "ceph0",
"addr": "10.10.30.0:6789\/0"},
{ "rank": 1,
"name": "ceph1",
"addr": "10.10.30.1:6789\/0"},
{ "rank": 2,
"name": "ceph2",
"addr": "10.10.30.2:6789\/0"}]}}
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com