Cephadm: How to remove a stray daemon ghost


Hi

I have a warning that says
"1 stray daemon(s) not managed by cephadm"

What I did is the following.
I have 3 nodes that the mons should run on, but because of a bug in 16.2.4 I couldn't run them there, since they are in a different subnet.
This was fixed in 16.2.5, so I upgraded, without issues.

Before I started it looked like this

root@pech-mon-1:~# ceph orch ps | grep ^mon
NAME            HOST        PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.pech-cog-1  pech-cog-1         running (23h)  9m ago     3w   1182M    2048M    16.2.5   6933c2a0b7dd  b226c1714777
mon.pech-mds-1  pech-mds-1         running (23h)  7m ago     3w   1147M    2048M    16.2.5   6933c2a0b7dd  40f8e268afca
mon.pech-mon-1  pech-mon-1         running (23h)  2m ago     3w   1161M    2048M    16.2.5   6933c2a0b7dd  b358057dcb3a


To place the daemons on the correct hosts I ran this
root@pech-mon-1:~# ceph orch apply mon pech-mon-1,pech-mon-2,pech-mon-3
Scheduled mon update...


And that worked fine.
root@pech-mon-1:~# ceph orch ps |grep ^mon
NAME            HOST        PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.pech-mon-1  pech-mon-1         running (23h)  6s ago     3w   1360M    2048M    16.2.5   6933c2a0b7dd  b358057dcb3a
mon.pech-mon-2  pech-mon-2         running (13s)  6s ago     13s  287M     2048M    16.2.5   6933c2a0b7dd  25a68933c119
mon.pech-mon-3  pech-mon-3         running (11s)  6s ago     11s  241M     2048M    16.2.5   6933c2a0b7dd  be0c6e5a5fdf


But I got a health warning
root@pech-mon-1:~# ceph health detail
HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
stray daemon mon.pech-mds-1 on host pech-cog-1 not managed by cephadm

The strange thing is that the daemon mon.pech-mds-1 has never run on pech-cog-1.
And the problem is that I cannot find this supposedly stray daemon.
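
For scripting the search, the daemon name and host can be pulled out of the warning text. This is only a minimal sketch, assuming the "stray daemon <name> on host <host> ..." line format shown in the health output above stays the same. The function name stray_daemons is made up for illustration, and the sample text is pasted in rather than read from a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: print "<daemon> <host>" for every stray-daemon line
# read on stdin. Assumes the line format from the health output above.
stray_daemons() {
    awk '$1 == "stray" && $2 == "daemon" && $4 == "on" && $5 == "host" { print $3, $6 }'
}

# Sample input copied from the health output above. On a live cluster this
# would be: ceph health detail | stray_daemons
health_detail='HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
stray daemon mon.pech-mds-1 on host pech-cog-1 not managed by cephadm'

printf '%s\n' "$health_detail" | stray_daemons
```

With the sample text this prints the daemon and host named in the warning, which can then be checked against what actually runs on that host.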


With Ansible I ran "podman ps" on all nodes and filtered the osd, node and crash daemons out of the output

$ ansible pech -u root -m shell -a "podman ps" | grep ceph | awk '{ print $NF }' | egrep -v "osd|node|crash" | sort
ceph-<fsid>-alertmanager.pech-mds-1
ceph-<fsid>-grafana.pech-cog-2
ceph-<fsid>-mgr.pech-mon-1.ptrsea
ceph-<fsid>-mgr.pech-mon-2.mfdanx
ceph-<fsid>-mon.pech-mon-1
ceph-<fsid>-mon.pech-mon-2
ceph-<fsid>-mon.pech-mon-3
ceph-<fsid>-prometheus.pech-mds-1

No stray daemon here


Also with Ansible I ran "cephadm ls" on all of them and filtered the osd, node and crash daemons out of the output

$ ansible pech -u root -m shell -a "cephadm ls | jq .[].name" | grep '^"' | egrep -v "osd|node|crash" | sort
"alertmanager.pech-mds-1"
"grafana.pech-cog-2"
"mgr.pech-mon-1.ptrsea"
"mgr.pech-mon-2.mfdanx"
"mon.pech-mon-1"
"mon.pech-mon-2"
"mon.pech-mon-3"
"prometheus.pech-mds-1"

No stray daemon here either.
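
The same comparison can be done mechanically instead of by eye. This is a minimal sketch, assuming the name list above is fed in one name per line with the JSON quotes stripped. The function name is_known_daemon is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if daemon name $1 appears as a whole line
# in the list of names read on stdin.
is_known_daemon() {
    grep -Fxq -- "$1"
}

# Names copied from the "cephadm ls" output above, quotes stripped.
known='alertmanager.pech-mds-1
grafana.pech-cog-2
mgr.pech-mon-1.ptrsea
mgr.pech-mon-2.mfdanx
mon.pech-mon-1
mon.pech-mon-2
mon.pech-mon-3
prometheus.pech-mds-1'

# Check the daemon named in the warning, plus one that is known to exist.
for name in mon.pech-mds-1 mon.pech-mon-1; do
    if printf '%s\n' "$known" | is_known_daemon "$name"; then
        echo "$name: found"
    else
        echo "$name: not found"
    fi
done
```

Matching on the whole line with fixed strings avoids a false hit such as alertmanager.pech-mds-1 matching a search for pech-mds-1.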

Does anyone know how to find this supposedly stray daemon?


--
Kai Stian Olstad


