Ceph newbie here; ceph 0.94.2, CentOS 6.6 x86_64. Kernel 2.6.32.
Initial test cluster of five OSD nodes, 3 MON, 1 MDS. Working well. I was
testing the removal of two MONs, just to see how it works. The second MON
was stopped and removed: no problems. The third MON was stopped and
removed: apparently no problems, and ceph told me that only one MON
remained. However, "ceph -s", along with many other commands, now hangs
for 5 minutes and then gives me an authentication timeout. On the initial
MON node, anderson, I get:
# ceph daemon mon.anderson mon_status
{
    "name": "anderson",
    "rank": 1,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "outside_quorum": [
        "anderson"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 4,
        "fsid": "b9aeb134-fe63-46b4-a939-152a6c188f6a",
        "modified": "2015-07-07 17:18:02.816853",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "benford",
                "addr": "10.22.200.13:6789\/0"
            },
            {
                "rank": 1,
                "name": "anderson",
                "addr": "10.22.200.16:6789\/0"
            }
        ]
    }
}
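As far as I understand it, the monitors need a strict majority of the members listed in the monmap to form a quorum. A minimal sketch of that arithmetic for the output above follows; the function name quorum_needed and the variable names are my own, for illustration only, and are not part of any ceph tool.

```shell
#!/bin/sh
# Strict majority needed for a monmap of N monitors: floor(N/2) + 1.
# quorum_needed is a hypothetical helper, not a ceph command.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

mons_in_map=2    # benford and anderson, per the monmap shown above
mons_running=1   # only anderson is actually up

needed=$(quorum_needed "$mons_in_map")
if [ "$mons_running" -ge "$needed" ]; then
    echo "quorum possible"
else
    echo "no quorum: $mons_running of $mons_in_map running, $needed needed"
fi
```

With two monitors still in the map and only one running, the majority of two can never be reached, which matches the "probing" state.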
So, no quorum. Here benford is the third MON, which I had already removed.
Its removal, which initially appeared to work, evidently did not complete
fully. I cannot start a MON on benford, however ("mon.benford not present
in monmap"). I cannot start the OSDs on any node either.
How do I recover from this situation?
Steve