Incidentally, I just noticed that my phantom host isn't completely gone. It's not in the host list, either on the command line or in the dashboard, but it does show up (with no OSDs under it) as a host in "ceph osd tree".

---

More seriously, I've been having problems with OSDs that report as being both up and down at the same time. This is on 2 new hosts. One host saw this when I made it the _admin host. The other caught it because it's running in a VM with the OSD mapped in as an imported disk, and the host OS managed to flip which drive was sda and which was sdb, resulting in having to delete and re-define the OSD in the VM.

But now the OSD on this VM reports as "UP/IN" on the dashboard, while it's "error" in "ceph orch ps", and on the actual vbox the OSD container fails on startup, viz:

Jul 12 20:06:48 ceph05.internal.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-osd-4[4017]: debug 2024-07-12T20:06:48.056+0000 7fc17dfb9380 -1 bdev(0x55e4853c4800 /var/lib/ceph/osd/ceph-4/block) open open got: (16) De>
Jul 12 20:06:48 ceph05.internal.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-osd-4[4017]: debug 2024-07-12T20:06:48.056+0000 7fc17dfb9380 -1 osd.4 0 OSD:init: unable to mount object store
Jul 12 20:06:48 ceph05.internal.mousetech.com ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f-osd-4[4017]: debug 2024-07-12T20:06:48.056+0000 7fc17dfb9380 -1 ** ERROR: osd init failed: (16) Device or resource busy

Note that the truncated message above reads:

bdev(0x55e4853c4800 /var/lib/ceph/osd/ceph-4/block) open open got: (16) Device or resource busy

Rebooting doesn't help, nor does freeing up resources and stopping/starting processes manually. The problem eventually cleared up spontaneously on the admin box, but I have no idea why.

---

Also noted that the OSD on the admin box now shows in "ceph orch ps" as "stopped", though again the dashboard lists it as "UP/IN". Here's what systemctl thinks about it:

systemctl status ceph-osd@5.service
● ceph-osd@5.service - Ceph object storage daemon osd.5
     Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; preset: disabled)
     Active: active (running) since Fri 2024-07-12 16:45:51 EDT; 1min 40s ago
    Process: 8511 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 5 (code=exited, status=0/SUCCESS)
   Main PID: 8517 (ceph-osd)
      Tasks: 70
     Memory: 478.6M
        CPU: 3.405s
     CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@5.service
             └─8517 /usr/bin/ceph-osd -f --cluster ceph --id 5 --setuser ceph --setgroup ceph

Jul 12 16:45:51 dell02.mousetech.com systemd[1]: Starting Ceph object storage daemon osd.5...
Jul 12 16:45:51 dell02.mousetech.com systemd[1]: Started Ceph object storage daemon osd.5.
Jul 12 16:45:51 dell02.mousetech.com ceph-osd[8517]: 2024-07-12T16:45:51.642-0400 7f2bd440c140 -1 Falling back to public interface
Jul 12 16:45:58 dell02.mousetech.com ceph-osd[8517]: 2024-07-12T16:45:58.352-0400 7f2bd440c140 -1 osd.5 34161 log_to_monitors {default=true}
Jul 12 16:45:59 dell02.mousetech.com ceph-osd[8517]: 2024-07-12T16:45:59.206-0400 7f2bcbbf0640 -1 osd.5 34161 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory

The actual container is not running.

Ceph version, incidentally, is 16.2.15, except for that one node that apparently didn't move up from Octopus (I'll be nuking that one shortly).
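For what it's worth, this is roughly how I've been lining up the different views of osd.5 on the admin box (the cephadm unit name below is just my guess based on the container naming pattern in the ceph05 log above, so adjust accordingly):

    ceph orch ps | grep osd.5                # what the orchestrator thinks
    cephadm ls | grep -A5 '"osd.5"'          # what cephadm has actually deployed on this host
    systemctl status ceph-osd@5.service      # the packaged (non-container) unit, shown above
    systemctl status ceph-278fcd86-0861-11ee-a7df-9c5c8e86cf8f@osd.5.service   # the cephadm container unit

My working guess is that the packaged ceph-osd@5 unit and the cephadm container are both trying to claim the same OSD (which would also be consistent with the "(16) Device or resource busy" on ceph05). If that's the case, I'd expect something like

    systemctl disable --now ceph-osd@5.service
    ceph orch daemon restart osd.5

to hand the device back to the container, but I haven't confirmed that yet.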
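And on the phantom host: since what's left is just an empty host entry in the CRUSH map, I'm assuming something like

    ceph orch host ls                        # phantom host absent here
    ceph osd tree                            # ...but still listed here, with nothing under it
    ceph osd crush remove <phantom-hostname>

is all it should take to finish it off (the hostname there is a placeholder), though I haven't pulled the trigger on that yet.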