In an attempt to troubleshoot why only 2/5 mon services were running, I believe I’ve broken something:

[ceph: root@cn01 /]# ceph orch ls
NAME                       PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                      1/1      81s ago    9d   count:1
crash                             6/6      7m ago     9d   *
grafana                           1/1      80s ago    9d   count:1
mds.testfs                        2/2      81s ago    9d   cn01.ceph.la1.clx.corp;cn02.ceph.la1.clx.corp;cn03.ceph.la1.clx.corp;cn04.ceph.la1.clx.corp;cn05.ceph.la1.clx.corp;cn06.ceph.la1.clx.corp;count:2
mgr                               2/2      81s ago    9d   count:2
mon                               2/5      81s ago    9d   count:5
node-exporter                     6/6      7m ago     9d   *
osd.all-available-devices         20/26    7m ago     9d   *
osd.unmanaged                     7/7      7m ago     -    <unmanaged>
prometheus                        2/2      80s ago    9d   count:2

I tried to stop and start the mon service, but now the cluster is pretty much unresponsive, I’m assuming because I stopped mon:

[ceph: root@cn01 /]# ceph orch stop mon
Scheduled to stop mon.cn01 on host 'cn01.ceph.la1.clx.corp'
Scheduled to stop mon.cn02 on host 'cn02.ceph.la1.clx.corp'
Scheduled to stop mon.cn03 on host 'cn03.ceph.la1.clx.corp'
Scheduled to stop mon.cn04 on host 'cn04.ceph.la1.clx.corp'
Scheduled to stop mon.cn05 on host 'cn05.ceph.la1.clx.corp'
[ceph: root@cn01 /]# ceph orch start mon
^CCluster connection aborted

Now, even after a reboot of the cluster, it’s still unresponsive. How do I get mon started again?

I’m going through Ceph and breaking things left and right, so I apologize for all the questions. I learn best from breaking things and figuring out how to resolve the issues.

Thank you
-jeremy