Hi,
I want to prepare a failed disk for replacement. I did:
ceph orch osd rm 35 --zap --replace
and it's now in the state "Done, waiting for purge", with 0 pgs, and
REPLACE and ZAP set to true. It's been like this for some hours, and now
my cluster is unhappy:
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
stray daemon osd.35 on host moss-be1002 not managed by cephadm
(the OSD is down & out)
...and also neither the disk nor the relevant NVMe LV has been zapped.
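(For what it's worth, the "Done, waiting for purge" state above is what
ceph orch osd rm status
reports for osd.35 in the removal queue.)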
I have my OSDs deployed via a spec:
service_type: osd
service_id: rrd_single_NVMe
placement:
  label: "NVMe"
spec:
  data_devices:
    rotational: 1
  db_devices:
    model: "NVMe"
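(For reference, that spec gets applied/re-applied with something like
ceph orch apply -i osd-spec.yaml
where the filename here is just illustrative.)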
And before issuing the ceph orch osd rm I set that service to unmanaged
(ceph orch set-unmanaged osd.rrd_single_NVMe), since obviously I don't want
Ceph to immediately re-create a new OSD on the failing disk.
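(I can double-check that flag with something like:
ceph orch ls osd --export
which, as far as I understand, should include "unmanaged: true" for
osd.rrd_single_NVMe while the service is unmanaged.)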
I'd expected from the docs[0] that this would leave the system ready for
the failed disk to be swapped out, including removing/wiping the NVMe LV,
and that I could then mark osd.rrd_single_NVMe as managed again and have a
new OSD built.
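(I assume I could do the cleanup by hand, something along the lines of:
ceph orch device zap moss-be1002 /dev/sdX --force
for the failed data device (/dev/sdX being a placeholder here), plus a
ceph-volume lvm zap --destroy on the db LV on the host itself, but I'd
expected the --zap on the rm command to take care of all of that.)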
What did I do wrong? I don't much care about the OSD id (though obviously
it's neater not to keep incrementing OSD numbers every time a disk dies),
but I thought that telling ceph orch not to make new OSDs and then using
ceph orch osd rm to zap the disk and NVMe LV would have been the way to
go...
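(My understanding of --replace is that osd.35 should end up marked as
"destroyed" in the CRUSH map rather than being deleted outright, so the id
gets reused by the next OSD created in that slot; once the removal actually
completes I'd expect
ceph osd tree
to show osd.35 with status "destroyed".)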
Thanks,
Matthew
[0] https://docs.ceph.com/en/reef/cephadm/services/osd/#replacing-an-osd