Re: ceph orch host rm seems to just move daemons out of cephadm, not remove them

Tried removing the daemon first, and that kinda blew up:

ceph orch daemon rm --force mon.tempmon
ceph orch host rm tempmon

Now there are two problems.

1. Ceph is still looking for it:

  services:
    mon: 4 daemons, quorum ceph1,ceph2,ceph3 (age 3s), out of quorum:
tempmon
    mgr: ceph1.oqptlg(active, since 9h), standbys: ceph2.hezrvv
    osd: 9 osds: 9 up (since 8h), 9 in (since 8h)

2. More worrying, the cephadm module is failing somewhere:
INFO:cephadm:Inferring fsid 09e9711e-db88-11ea-b8c2-791b9888d2f2
INFO:cephadm:Using recent ceph image ceph/ceph:v15
  cluster:
    id:     09e9711e-db88-11ea-b8c2-791b9888d2f2
    health: HEALTH_ERR
            Module 'cephadm' has failed: must be str, not NoneType
            1/4 mons down, quorum ceph1,ceph2,ceph3

root@ceph1:/home/vagrant# ceph health detail
INFO:cephadm:Inferring fsid 09e9711e-db88-11ea-b8c2-791b9888d2f2
INFO:cephadm:Using recent ceph image ceph/ceph:v15
HEALTH_ERR Module 'cephadm' has failed: must be str, not NoneType; 1/4 mons
down, quorum ceph1,ceph2,ceph3
[ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: must be str, not
NoneType
    Module 'cephadm' has failed: must be str, not NoneType
[WRN] MON_DOWN: 1/4 mons down, quorum ceph1,ceph2,ceph3
    mon.tempmon (rank 3) addr [v2:10.16.16.10:3300/0,v1:10.16.16.10:6789/0]
is down (out of quorum)
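
To see where the cephadm module is actually falling over, and to try to get
it going again, my plan is to watch the cephadm log channel for the
traceback and then fail over the active mgr so the module reloads. This is
just my guess at the recovery path, not something I've found in the docs:

# watch the cephadm cluster log channel for the module traceback
ceph -W cephadm

# fail the active mgr (ceph1.oqptlg here) so the standby takes over and
# the cephadm module gets reloaded
ceph mgr fail ceph1.oqptlg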

Is there another step that should be taken? I'd expect "ceph orch host rm"
to also take anything it managed on that host out of the cluster.
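
My current guess at the missing step is that removing the daemon and the
host still leaves the mon in the monmap, so it has to be dropped by hand,
something like:

# remove the stale monitor from the monmap by name
ceph mon remove tempmon

# then check that the mon count and quorum look sane again
ceph -s

If "ceph orch host rm" is supposed to handle that on its own, then maybe
what I'm seeing is a bug.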

On Mon, Aug 10, 2020 at 12:42 PM pixel fairy <pixelfairy@xxxxxxxxx> wrote:

> Made a cluster of 2 OSD hosts and one temp monitor, then added another
> OSD host and did a "ceph orch host rm tempmon". This is all in Vagrant
> (libvirt), with the generic/ubuntu2004 box.
>
> INFO:cephadm:Inferring fsid 5426a59e-db33-11ea-8441-b913b695959d
> INFO:cephadm:Using recent ceph image ceph/ceph:v15
>   cluster:
>     id:     5426a59e-db33-11ea-8441-b913b695959d
>     health: HEALTH_WARN
>             2 stray daemons(s) not managed by cephadm
>             1 stray host(s) with 2 daemon(s) not managed by cephadm
>
> added 2 more osd hosts, and ceph -s gave me this,
>   services:
>     mon: 6 daemons, quorum ceph5,ceph4,tempmon,ceph3,ceph2,ceph1 (age 33m)
>     mgr: ceph5.erdofb(active, since 82m), standbys: tempmon.xkrlmm,
> ceph3.xjuecs
>     osd: 15 osds: 15 up (since 33m), 15 in (since 33m)
>
> My guess is cephadm wanted 5 managed mons, so it did that, but it still
> never removed the mon on the removed host. It's still up. This is just a
> Vagrantfile, so I have two questions.
>
> 1. How do you remove that other host and its daemons from the cluster?
> 2. How would you recover from a host being destroyed?
>
> P.S. Tried Google:
> Your search - "ceph orch host rm" "stray daemons(s) not manage by
> cephadm" - did not match any documents.
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


