Hi,
> When things settle down, I *MIGHT* put in a RFE to change the
> default for ceph-volume to --no-systemd to save someone else from
> this anguish.
Note that there are still users/operators/admins who don't use
containers. Changing the ceph-volume default might not be the best
idea for those setups.
Regarding the cleanup, this was the thread [1] Tim was referring to. I
would set the noout flag, stop an OSD (so the device won't be busy
anymore), make sure that both ceph-osd@{OSD_ID} and
ceph-{FSID}@osd.{OSD_ID} are stopped, then double check that
everything you need is still under /var/lib/ceph/{FSID}/osd.{OSD_ID},
like configs and keyrings. Disable the ceph-osd@{OSD_ID} unit (as
already pointed out), then check if the orchestrator can start the OSD
via systemd:
ceph orch daemon start osd.{OSD_ID}
or alternatively, try it manually:
systemctl reset-failed
systemctl start ceph-{FSID}@osd.{OSD_ID}
Watch the log for that OSD to identify any issues. If it works, unset
the noout flag. You might want to ensure it also works after a reboot,
though.
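If it helps, here is the whole sequence as a rough sketch. This is
untested and {OSD_ID} and {FSID} are just placeholders for your actual
OSD id and cluster fsid, with the unit names assumed to match the ones
above, so adjust to your setup:
ceph osd set noout
systemctl stop ceph-osd@{OSD_ID}          # the legacy (non-cephadm) unit
systemctl is-active ceph-osd@{OSD_ID} ceph-{FSID}@osd.{OSD_ID}   # both should report inactive
ls /var/lib/ceph/{FSID}/osd.{OSD_ID}      # configs, keyring etc. still in place?
systemctl disable ceph-osd@{OSD_ID}
# if a leftover legacy mount still keeps the device busy:
umount /var/lib/ceph/osd/ceph-{OSD_ID}
ceph orch daemon start osd.{OSD_ID}
# or, if the orchestrator doesn't pick it up, manually:
systemctl reset-failed
systemctl start ceph-{FSID}@osd.{OSD_ID}
journalctl -fu ceph-{FSID}@osd.{OSD_ID}.service   # watch the OSD log
ceph osd unset noout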
I don't think it should be necessary to redeploy the OSDs, but the
cleanup has to be done properly.
As guidance, you can check the cephadm tool's contents and look for
the "adopt" function. That migrates the contents of the pre-cephadm
daemons into the FSID-specific directories.
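For example (just a sketch, nothing I have run against your cluster;
osd.{OSD_ID} is a placeholder), you could inspect the script and see
what adopting a single legacy OSD would look like:
grep -n adopt $(which cephadm)    # find the adopt code paths (works where cephadm is still a plain Python script)
cephadm ls                        # shows which daemons are still "legacy" style
cephadm adopt --style legacy --name osd.{OSD_ID}   # would move it under /var/lib/ceph/{FSID}/
Whether actually running "adopt" is the right move here depends on how
those OSDs were created, so I'd read through that function first.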
Regards,
Eugen
[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/K2R3MXRD3S2DSXCEGX5IPLCF5L3UUOQI/
Quoting Dan O'Brien <dobrie2@xxxxxxx>:
OK... I've been in the Circle of Hell where systemd lives and I
*THINK* I have convinced myself I'm OK. I *REALLY* don't want to
trash and rebuild the OSDs.
In the manpage for systemd.unit, I found
UNIT GARBAGE COLLECTION
The system and service manager loads a unit's configuration
automatically when a unit is referenced for the first time. It will
automatically unload the unit configuration and state again when the
unit is not needed anymore ("garbage collection").
I've disabled the systemd units (which removes the symlink from the
target) for the non-cephadm OSDs I created by mistake and I'm PRETTY
SURE if I wait long enough (or reboot) that I won't see them any
more, since there won't be a unit for systemd to care about.
I *WILL* have to clean up /var/lib/ceph/osd eventually. I tried just
now, but it says "device busy." I think that's because there's some
OTHER systemd cruft that shows a mount:
[root@ceph02 ~]# systemctl --all | grep ceph | grep mount
var-lib-ceph-osd-ceph\x2d11.mount  loaded active mounted  /var/lib/ceph/osd/ceph-11
var-lib-ceph-osd-ceph\x2d25.mount  loaded active mounted  /var/lib/ceph/osd/ceph-25
var-lib-ceph-osd-ceph\x2d9.mount   loaded active mounted  /var/lib/ceph/osd/ceph-9
When things settle down, I *MIGHT* put in a RFE to change the
default for ceph-volume to --no-systemd to save someone else from
this anguish.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx