Hi Alison,

I have observed exactly that with OSDs "converted" from ceph-disk to ceph-volume. Someone thought it would be a great idea to store the /dev device name in the config instead of the uuid or any other stable device path:

# cat /etc/ceph/osd/287-2eaf591b-bced-4097-9499-5fda071c6161.json
{
    ...
    "block": {
        "path": "/dev/disk/by-partuuid/0c8a9f89-efa7-4c75-87ad-2f0d5aa2d649",
        "uuid": "0c8a9f89-efa7-4c75-87ad-2f0d5aa2d649"
    },
    ...
    "data": {
        "path": "/dev/sdm1",
        "uuid": "2eaf591b-bced-4097-9499-5fda071c6161"
    },
    ...
}

Funnily enough, the uuid is stored as well (for "block" even as a by-partuuid path), but for "data" it is the /dev path that is actually used during activation. My "fix" is to re-generate the OSD json just before every ceph-disk OSD start.

You seem to be using LVM OSDs already, so this is a bit weird (it can't be the exact same issue). Still, I would not be surprised if you are bitten by something similar, where some stored config (cache) overrides the actual drive location. It is really a blessing that the developers implemented a check that a partition actually points to the data with the correct OSD ID, otherwise our cluster would be wrecked by now.

I would start by using low-level commands (ceph-volume) directly to see if the issue is low-level or sits in some higher-level interface. Log in to the OSD node and check what "ceph-volume inventory" says and whether you can manually activate/deactivate the OSD on disk (be careful to include the --no-systemd option everywhere to avoid unintended changes to persistent configuration).

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: apeisker@xxxxxxxx <apeisker@xxxxxxxx>
Sent: Friday, August 25, 2023 10:29 PM
To: ceph-users@xxxxxxx
Subject: Re: A couple OSDs not starting after host reboot

Hi,

Thank you for your reply. I don’t think the device names changed, but ceph seems to be confused about which device the OSD is on.
It’s reporting that there are 2 OSDs on the same device although this is not true.

ceph device ls-by-host <osd-node> | grep sdu
ATA_HGST_HUH728080ALN600_VJH4GLUX  sdu  osd.665
ATA_HGST_HUH728080ALN600_VJH60MAX  sdu  osd.657

The osd.665 is actually on device sdm. Could this be the cause of the issue? Is there a way to correct it?

Thanks,
Alison
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
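
The stale-path problem described in Frank's reply (a stored /dev name that no longer matches the stored uuid) can be checked read-only, before any activation is attempted. The following is a minimal Python sketch, not part of ceph-volume: the function names are hypothetical, and it assumes that each "path"/"uuid" pair in the OSD json refers to a partition whose uuid appears as a symlink under /dev/disk/by-partuuid, as the "block" entry in the example suggests. It only applies to json files of the kind shown above, not to LVM OSDs.

```python
import json
import os


def find_stale_paths(osd_meta, by_partuuid_dir="/dev/disk/by-partuuid"):
    """Return (section, stored_path, device_found_via_uuid) tuples for every
    section whose stored path and stored uuid resolve to different devices.

    device_found_via_uuid is None when no symlink exists for the uuid.
    """
    stale = []
    for section, entry in osd_meta.items():
        # Only sections shaped like {"path": ..., "uuid": ...} are checked.
        if not isinstance(entry, dict):
            continue
        if "path" not in entry or "uuid" not in entry:
            continue
        by_uuid = os.path.join(by_partuuid_dir, entry["uuid"])
        if not os.path.exists(by_uuid):
            # The uuid is not known to udev at all (assumption: worth flagging).
            stale.append((section, entry["path"], None))
            continue
        actual = os.path.realpath(by_uuid)
        if os.path.realpath(entry["path"]) != actual:
            stale.append((section, entry["path"], actual))
    return stale


def check_file(json_path, by_partuuid_dir="/dev/disk/by-partuuid"):
    """Load one OSD json file and report its stale sections."""
    with open(json_path) as fh:
        return find_stale_paths(json.load(fh), by_partuuid_dir)
```

Calling check_file() on the json file of an affected OSD would return a non-empty list if, for example, "data" still says /dev/sdm1 while the uuid now resolves to a partition on another disk. An empty list means the stored paths and uuids agree, so the cause is more likely to sit elsewhere.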