You have conflated "ceph orch host add" and "ceph orch host label add". This is not valid syntax:

    ceph orch host add osdev-ctrl3 mon

The docs imply that the trailing "mon" would simply be ignored, since there is no sixth argument to that command:
https://docs.ceph.com/en/latest/cephadm/host-management/

You have to run a separate:

    ceph orch host label add osdev-ctrl3 mon

----- Original Message -----
From: "Gary Molenkamp" <molenkam@xxxxxx>
To: "ceph-users" <ceph-users@xxxxxxx>
Sent: Wednesday, March 31, 2021 8:12:53 AM
Subject: understanding orchestration and cephadm

Good morning all,

I'm experimenting with ceph orchestration and cephadm after using ceph-deploy for several years, and I have a hopefully simple question. I converted a basic Nautilus cluster over to cephadm + orchestration and tried adding, then removing, a monitor. However, when I removed the host using 'ceph orch host rm', it removed two mons. I may have missed something in the adoption/upgrade that has left the cluster in a bad state. Any advice/pointers/clarification would be appreciated.

Details: a Nautilus cluster with two mons (I know this is not correct for quorum), a mgr, and a handful of osds. I went through the adoption process and enabled the ceph orch backend.

[root@osdev-ctrl2 ~]# ceph orch ps
NAME             HOST         STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                    IMAGE ID      CONTAINER ID
mgr.osdev-ctrl2  osdev-ctrl2  running (18h)  50s ago    18h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  e73c19b51a09
mon.osdev-ctrl2  osdev-ctrl2  running (18h)  50s ago    18h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  a6bfc27221f0
mon.osdev-net1   osdev-net1   running (18h)  50s ago    18h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  f66e2bef3d44
osd.0            osdev-stor1  running (17h)  50s ago    17h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  ac59dbdc267c
...
[root@osdev-ctrl2 ~]# ceph orch status
Backend: cephadm
Available: True

[root@osdev-ctrl2 ~]# ceph orch host ls
HOST         ADDR         LABELS   STATUS
osdev-ctrl2  osdev-ctrl2  mon mgr
osdev-net1   osdev-net1   mon
osdev-stor1  osdev-stor1  osd

[root@osdev-ctrl2 ~]# ceph orch ls
NAME  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                    IMAGE ID
mgr   1/1      9m ago     20h  label:mgr  docker.io/ceph/ceph:v15.2.10  5b724076c58f
mon   2/2      9m ago     20h  label:mon  docker.io/ceph/ceph:v15.2.10  5b724076c58f

I then added a new mon host:

[root@osdev-ctrl2 ~]# ceph orch host add osdev-ctrl3 mon

It did not spawn a mon container on osdev-ctrl3 until I defined the public network in the config:

[root@osdev-ctrl2 ~]# ceph config set global public_network 10.10.10.0/24

At this point all was good, with three running mons as expected. I then wanted to delete the mon, using:

[root@osdev-ctrl2 ~]# ceph orch host rm osdev-ctrl3

This had the effect of:
1. removing the osdev-ctrl3 mon from 'ceph orch ls' and 'ceph orch ps';
2. the mon on osdev-ctrl3 is still running and is part of 'ceph -s', but is reported as not managed by cephadm;
3. (big issue) the mon running on osdev-net1 was completely destroyed.

Any ideas what is going on? Sorry for the long post, but I tried to be as clear as possible.

--
Gary Molenkamp                  Computer Science/Science Technology Services
Systems Administrator           University of Western Ontario
molenkam@xxxxxx                 http://www.csd.uwo.ca
(519) 661-2111 x86882           (519) 661-3566
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
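
For reference, the full flow with the label as a separate step would look something like this (a sketch only — I haven't replayed it against a v15.2.10 cluster, so treat it as a suggestion rather than a verified recipe):

    # add the host, then label it in a second step:
    ceph orch host add osdev-ctrl3
    ceph orch host label add osdev-ctrl3 mon

    # to retire it later, drop the label first so the label:mon
    # placement no longer matches the host, let cephadm remove the
    # mon daemon there, and only then remove the host:
    ceph orch host label rm osdev-ctrl3 mon
    ceph orch host rm osdev-ctrl3

With a label-based placement like yours, adding and removing the label is what actually drives where mon daemons get scheduled; 'ceph orch host rm' by itself does not stop the daemons running on that host, which matches what you saw in point 2.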