Good morning all,
I'm experimenting with ceph orchestration and cephadm after using
ceph-deploy for several years, and I have a hopefully simple question.
I've converted a basic nautilus cluster over to cephadm+orchestration
and I tried adding, then removing a monitor. However, when I removed the
host using 'ceph orch host rm', it removed two mons. I may have missed
something in the adoption/upgrade that has left the cluster in a bad
state. Any advice/pointers/clarification would be of assistance.
Details:
A nautilus cluster with two mons (I know this is not correct for
quorum), a mgr, and a handful of osds. I went through the adoption
process and enabled the ceph orch backend.
[root@osdev-ctrl2 ~]# ceph orch ps
NAME             HOST         STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                    IMAGE ID      CONTAINER ID
mgr.osdev-ctrl2  osdev-ctrl2  running (18h)  50s ago    18h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  e73c19b51a09
mon.osdev-ctrl2  osdev-ctrl2  running (18h)  50s ago    18h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  a6bfc27221f0
mon.osdev-net1   osdev-net1   running (18h)  50s ago    18h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  f66e2bef3d44
osd.0            osdev-stor1  running (17h)  50s ago    17h  15.2.10  docker.io/ceph/ceph:v15.2.10  5b724076c58f  ac59dbdc267c
...
[root@osdev-ctrl2 ~]# ceph orch status
Backend: cephadm
Available: True
[root@osdev-ctrl2 ~]# ceph orch host ls
HOST         ADDR         LABELS   STATUS
osdev-ctrl2  osdev-ctrl2  mon mgr
osdev-net1   osdev-net1   mon
osdev-stor1  osdev-stor1  osd
[root@osdev-ctrl2 ~]# ceph orch ls
NAME  RUNNING  REFRESHED  AGE  PLACEMENT  IMAGE NAME                    IMAGE ID
mgr   1/1      9m ago     20h  label:mgr  docker.io/ceph/ceph:v15.2.10  5b724076c58f
mon   2/2      9m ago     20h  label:mon  docker.io/ceph/ceph:v15.2.10  5b724076c58f
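Since the mon service is placed by label (label:mon), it may help to see
which hosts that placement should select. Below is a minimal sketch that
filters 'ceph orch host ls' style output by label. The helper name
hosts_with_label is hypothetical, and the sample data is copied from the
host listing above; on a live cluster the function would read the real
command output instead.

```shell
# Hypothetical helper: print the hosts that carry a given label.
# Reads 'ceph orch host ls' style output on stdin and skips the header row.
# Assumes columns are HOST, ADDR, then labels (as in the listing above).
hosts_with_label() {
    awk -v label="$1" 'NR > 1 { for (i = 3; i <= NF; i++) if ($i == label) print $1 }'
}

# Sample data copied from the 'ceph orch host ls' output above.
# On a live cluster: ceph orch host ls | hosts_with_label mon
sample_hosts='HOST ADDR LABELS STATUS
osdev-ctrl2 osdev-ctrl2 mon mgr
osdev-net1 osdev-net1 mon
osdev-stor1 osdev-stor1 osd'

printf '%s\n' "$sample_hosts" | hosts_with_label mon
```

With the sample data this lists osdev-ctrl2 and osdev-net1, which matches
the 2/2 mons reported for the label:mon placement.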
I then added a new mon host:
[root@osdev-ctrl2 ~]# ceph orch host add osdev-ctrl3 mon
It did not spawn a mon container on osdev-ctrl3 until I defined the
public network in the config:
[root@osdev-ctrl2 ~]# ceph config set global public_network 10.10.10.0/24
At this point all was good, with three running mons as expected. I then
tried to remove the new mon with:
[root@osdev-ctrl2 ~]# ceph orch host rm osdev-ctrl3
This had the following effects:
1. The osdev-ctrl3 mon was removed from 'ceph orch ls' and 'ceph orch ps'.
2. The mon on osdev-ctrl3 kept running and is still part of 'ceph -s',
but is reported as not managed by cephadm.
3. (Big issue) The mon running on osdev-net1 was completely destroyed.
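To make the second symptom concrete, here is a small sketch for listing
mons that are in the cluster but not managed by cephadm. The helper name
only_in_first is hypothetical, and both sample lists are typed by hand
from the symptoms described above, not captured output. On a live cluster
the first list would presumably be derived from 'ceph mon dump' and the
second from 'ceph orch ps'.

```shell
# Hypothetical helper: print names that appear in the first
# newline-separated list but not in the second.
only_in_first() {
    # -F fixed strings, -x whole-line match, -v invert the match;
    # each line of "$2" is treated as a separate pattern.
    printf '%s\n' "$1" | grep -vxF -e "$2"
}

# Illustrative values only, based on the symptoms described above.
in_monmap='osdev-ctrl2
osdev-ctrl3'
managed='osdev-ctrl2'

only_in_first "$in_monmap" "$managed"
```

With these illustrative lists, the only name printed is osdev-ctrl3, the
mon that is still running but no longer tracked by the orchestrator.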
Any ideas what is going on? Sorry for the long post, but I tried to be
as clear as possible.
--
Gary Molenkamp Computer Science/Science Technology Services
Systems Administrator University of Western Ontario
molenkam@xxxxxx http://www.csd.uwo.ca
(519) 661-2111 x86882 (519) 661-3566