Re: Only 2/5 mon services running

It looks like the second mon server was down from my reboot.  I restarted it and everything is functional again, but I still can’t figure out why only 2 out of the 5 mon servers are running and the others won’t start.  If they were functioning, I probably wouldn’t have noticed the cluster being down.
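
For reference, a minimal sketch of the quorum arithmetic behind this, assuming the monitors need a strict majority of the monitors listed in the monmap to form a quorum. The function names are hypothetical and the monmap sizes are illustrative, not read from this cluster:

```python
# Sketch of monitor quorum arithmetic (assumption: strict majority of the monmap).
# All names here are hypothetical helpers, not part of any Ceph API.

def quorum_size(monmap_size: int) -> int:
    """Smallest number of monitors that is a strict majority."""
    return monmap_size // 2 + 1


def has_quorum(monmap_size: int, running: int) -> bool:
    """True if the running monitors can form a quorum."""
    return running >= quorum_size(monmap_size)


def tolerated_failures(monmap_size: int) -> int:
    """How many monitors can be down while quorum is kept."""
    return monmap_size - quorum_size(monmap_size)


if __name__ == "__main__":
    for size in (2, 3, 5):
        print(size, quorum_size(size), tolerated_failures(size))
```

Under that assumption, a monmap with only two monitors cannot lose either of them, while a monmap with all five could lose two and stay responsive.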

Thanks
-jeremy


> On Jun 7, 2021, at 7:53 PM, Jeremy Hansen <jeremy@xxxxxxxxxx> wrote:
> 
> In an attempt to troubleshoot why only 2/5 mon services were running, I believe I’ve broken something:
> 
> [ceph: root@cn01 /]# ceph orch ls
> NAME                       PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
> alertmanager                          1/1  81s ago    9d   count:1
> crash                                 6/6  7m ago     9d   *
> grafana                               1/1  80s ago    9d   count:1
> mds.testfs                            2/2  81s ago    9d   cn01.ceph.la1.clx.corp;cn02.ceph.la1.clx.corp;cn03.ceph.la1.clx.corp;cn04.ceph.la1.clx.corp;cn05.ceph.la1.clx.corp;cn06.ceph.la1.clx.corp;count:2
> mgr                                   2/2  81s ago    9d   count:2
> mon                                   2/5  81s ago    9d   count:5
> node-exporter                         6/6  7m ago     9d   *
> osd.all-available-devices           20/26  7m ago     9d   *
> osd.unmanaged                         7/7  7m ago     -    <unmanaged>
> prometheus                            2/2  80s ago    9d   count:2
> 
> I tried to stop and start the mon service, but now the cluster is pretty much unresponsive, I’m assuming because I stopped mon:
> 
> [ceph: root@cn01 /]# ceph orch stop mon
> Scheduled to stop mon.cn01 on host 'cn01.ceph.la1.clx.corp'
> Scheduled to stop mon.cn02 on host 'cn02.ceph.la1.clx.corp'
> Scheduled to stop mon.cn03 on host 'cn03.ceph.la1.clx.corp'
> Scheduled to stop mon.cn04 on host 'cn04.ceph.la1.clx.corp'
> Scheduled to stop mon.cn05 on host 'cn05.ceph.la1.clx.corp'
> [ceph: root@cn01 /]# ceph orch start mon
> 
> 
> ^CCluster connection aborted
> 
> 
> Now even after a reboot of the cluster, it’s unresponsive.  How do I get mon started again?
> 
> I’m going through Ceph and breaking things left and right, so I apologize for all the questions.  I learn best from breaking things and figuring out how to resolve the issues.
> 
> 
> Thank you
> -jeremy
> 
> 

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
