Re: osd removal leaves 'stray daemon'

Hello,
a mgr failover did not change the situation - the osd still shows up in 'ceph node ls'. I assume this is more or less working as intended, since I asked for the OSD to be kept in the CRUSH map so it can be replaced later. But as we are still not very experienced with Ceph here, I wanted to get some input from other sites.
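In the interim, it looks like the warning itself can be muted via a cephadm mgr option (a sketch only - double-check the option name against the cephadm documentation for your release, and note this silences the stray-daemon check cluster-wide, not just for osd.224):

```shell
# Assumption: mgr/cephadm/warn_on_stray_daemons controls the
# CEPHADM_STRAY_DAEMON health check. Disabling it hides the warning
# for ALL stray daemons, not only the destroyed osd.224.
ceph config set mgr mgr/cephadm/warn_on_stray_daemons false

# Re-enable it once the disk has been replaced and the OSD recreated:
ceph config set mgr mgr/cephadm/warn_on_stray_daemons true
```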

Regards,
Holger

On 30.11.22 16:28, Adam King wrote:
I typically don't see this when I do OSD replacement. If you do a mgr failover ("ceph mgr fail") and wait a few minutes, does this still show up? The stray daemon/host warning is roughly equivalent to comparing the daemons in `ceph node ls` and `ceph orch ps` and seeing if there's anything in the former but not the latter. Sometimes I have seen that the mgr has out-of-date info in the node ls, and a failover will refresh it.
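The comparison described above can be sketched roughly as a set difference (a simplified illustration only, not cephadm's actual implementation; the sample daemon lists are invented):

```python
# Rough sketch of the stray-daemon check: anything the cluster reports
# ("ceph node ls") but the orchestrator does not manage ("ceph orch ps")
# counts as stray. The sample data below is made up for illustration.

def find_strays(node_ls_daemons, orch_ps_daemons):
    """Return daemons the cluster knows about but cephadm does not manage."""
    return sorted(set(node_ls_daemons) - set(orch_ps_daemons))

# Example: osd.224 was removed with --replace, so the orchestrator no
# longer lists it, but a stale mgr view of 'ceph node ls' still does.
cluster_view = ["osd.214", "osd.224", "osd.234"]
orch_view = ["osd.214", "osd.234"]

print(find_strays(cluster_view, orch_view))  # ['osd.224']
```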

On Fri, Nov 25, 2022 at 6:07 AM Holger Naundorf <naundorf@xxxxxxxxxxxxxx> wrote:

    Hello,
    I have a question about osd removal/replacement:

    I just removed an OSD where the disk was still running but had read
    errors, leading to failed deep scrubs. As the intent is to replace it
    as soon as we manage to get a spare, I removed it with the
    '--replace' flag:

    # ceph orch osd rm 224 --replace

    After all placement groups were evacuated, I now have 1 osd down/out
    and showing as 'destroyed':

    # ceph osd tree
    ID   CLASS  WEIGHT      TYPE NAME        STATUS     REWEIGHT  PRI-AFF
    (...)
    214    hdd    14.55269          osd.214         up   1.00000  1.00000
    224    hdd    14.55269          osd.224  destroyed         0  1.00000
    234    hdd    14.55269          osd.234         up   1.00000  1.00000
    (...)

    All as expected - but now the health check complains that the
    (destroyed) osd is not managed:

    # ceph health detail
    HEALTH_WARN 1 stray daemon(s) not managed by cephadm
    [WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
          stray daemon osd.224 on host ceph19 not managed by cephadm

    Is this expected behaviour, so that I have to live with the yellow
    health check until we get a replacement disk and recreate the OSD,
    or did something not finish correctly?

    Regards,
    Holger

    --
    Dr. Holger Naundorf
    Christian-Albrechts-Universität zu Kiel
    Rechenzentrum / HPC / Server und Storage
    Tel: +49 431 880-1990
    Fax:  +49 431 880-1523
    naundorf@xxxxxxxxxxxxxx
    _______________________________________________
    ceph-users mailing list -- ceph-users@xxxxxxx
    To unsubscribe send an email to ceph-users-leave@xxxxxxx


--
Dr. Holger Naundorf
Christian-Albrechts-Universität zu Kiel
Rechenzentrum / HPC / Server und Storage
Tel: +49 431 880-1990
Fax:  +49 431 880-1523
naundorf@xxxxxxxxxxxxxx
