Hi Eugen, once again thank you for your time.

One of the servers has 2 OSDs (osd.0 and osd.7):

- osd.0 uses /dev/sdb as its data disk and part of /dev/sdd (the NVMe) for block.db
- osd.7 uses /dev/sdc as its data disk and part of /dev/sdd (the NVMe) for block.db

/dev/sdc was replaced (hardware failure) and I deleted osd.7, so the server kept only one OSD (osd.0).

I ran the following command:

  cephadm ceph-volume lvm batch --report /dev/sdb /dev/sdc --db-devices /dev/sdd

and it returned:

  Using recent ceph image docker.io/ceph/ceph:v15
  WARNING: The same type, major and minor should not be used for multiple devices.

  Filtered Devices:
    <Raw Device: /dev/sdb>
      Used by ceph already

  Total OSDs: 1

    Type      Path        LV Size     % of device
  ----------------------------------------------------------------
    [data]    /dev/sdc    19.00 GB    100.0%

I see it's not using /dev/sdd for block.db because it's not empty.

I also checked which VG/LV osd.7 was using, with:

  cephadm ceph-volume lvm list

which returned:

  ====== osd.7 =======

    [db]    /dev/ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17/osd-block-db-b1e2a81f-2fc9-4786-85d2-6a27430d9f2e

        block device      /dev/ceph-block-7c6ddde4-0314-43ab-8d00-a5b80cc8f74f/osd-block-95e812e9-f420-45b3-9f18-1e2bc6d3ff62
        block uuid        0IAPL7-dJUa-RJH2-9pk2-nv4w-IaE3-rSPBIO
        cephx lockbox secret
        cluster fsid      cc564efa-db19-494e-83b8-68b1d54724b6
        cluster name      ceph
        crush device class  None
        db device         /dev/ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17/osd-block-db-b1e2a81f-2fc9-4786-85d2-6a27430d9f2e
        db uuid           TWvHrS-F9jx-0lld-Rlup-1rXk-Zu9q-A5H63W
        encrypted         0
        osd fsid          f22b9ec4-d63e-4667-b275-17018a609729
        osd id            7
        osdspec affinity  drive_group_nubceph04
        type              db
        vdo               0
        devices           /dev/sdd

I zapped that VG/LV with:

  ceph orch device zap nubceph04 /dev/ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17/osd-block-db-b1e2a81f-2fc9-4786-85d2-6a27430d9f2e --force

to leave /dev/sdd with free space, so as to create another LV for the missing OSD, and ran the report command again:

  cephadm ceph-volume lvm batch --report /dev/sdb /dev/sdc --db-devices /dev/sdd

It showed, once again:

  Using recent ceph image docker.io/ceph/ceph:v15
  WARNING: The same type, major and minor should not be used for multiple devices.

  Filtered Devices:
    <Raw Device: /dev/sdb>
      Used by ceph already

  Total OSDs: 1

    Type      Path        LV Size     % of device
  ----------------------------------------------------------------
    [data]    /dev/sdc    19.00 GB    100.0%

I then deleted the OSD remaining on the server and cleaned all the disks. After that, I ran the command you told me (remember: /dev/sdb and /dev/sdc for data, /dev/sdd is the NVMe):

  cephadm ceph-volume lvm batch --report /dev/sdb /dev/sdc --db-devices /dev/sdd

It returned this:

  Using recent ceph image docker.io/ceph/ceph:v15
  Total OSDs: 2

  Solid State VG:
    Targets:   block.db    Total size:  9.00 GB
    Total LVs: 2           Size per LV: 4.50 GB
    Devices:   /dev/sdd

    Type          Path        LV Size     % of device
  ----------------------------------------------------------------
    [data]        /dev/sdb    19.00 GB    100.0%
    [block.db]    vg: vg/lv   4.50 GB     50%
  ----------------------------------------------------------------
    [data]        /dev/sdc    19.00 GB    100.0%
    [block.db]    vg: vg/lv   4.50 GB     50%

My conclusion (I may be wrong) is that if the disk where the block.db will be placed is not 100% empty, then ceph-volume does not use it.

My question, then, is: is there a way to recreate/create a new OSD with its block.db on a disk which is not completely empty?

Answering your questions one by one:

1. "Can you check what ceph-volume would do if you did it manually?"
   All answered above.

2. "Did you properly wipe the previous LV on that NVMe?"
   If I'm not mistaken, I did exactly that when I ran the command:

     ceph orch device zap nubceph04 /dev/ceph-block-dbs-8b159f55-2500-427f-9743-2bb8b3df3e17/osd-block-db-b1e2a81f-2fc9-4786-85d2-6a27430d9f2e --force

3.
"You should also have some logs available from the deployment attempt, maybe it reveals why the NVMe was not considered."
   I couldn't find any relevant logs regarding this question.

Best regards,
Eric

-----Original Message-----
From: Eugen Block [mailto:eblock@xxxxxx]
Sent: 24 August 2021 05:07
To: ceph-users@xxxxxxx
Subject: Re: Missing OSD in SSD after disk failure

Can you check what ceph-volume would do if you did it manually? Something like this:

  host1:~ # cephadm ceph-volume lvm batch --report /dev/vdc /dev/vdd --db-devices /dev/vdb

and don't forget the '--report' flag.

One more question: did you properly wipe the previous LV on that NVMe?

You should also have some logs available from the deployment attempt; maybe it reveals why the NVMe was not considered.

Zitat von Eric Fahnle <efahnle@xxxxxxxxxxx>:

> Hi Eugen, thanks for the reply.
>
> I've already tried what you wrote in your answer, but still no luck.
>
> The NVMe disk still doesn't have the OSD. Please note I'm using
> containers, not standalone OSDs.
>
> Any ideas?
>
> Regards,
> Eric
>
> ________________________________
> Message: 2
> Date: Fri, 20 Aug 2021 06:56:59 +0000
> From: Eugen Block <eblock@xxxxxx>
> Subject: Re: Missing OSD in SSD after disk failure
> To: ceph-users@xxxxxxx
> Message-ID:
>   <20210820065659.Horde.Azw9eV10u5ynqKwJpUyrg6_@xxxxxxxxxxxxxx>
> Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes
>
> Hi,
>
> this seems to be a recurring issue; I had the same just yesterday in
> my lab environment running on 15.2.13. If I don't specify other
> criteria in the yaml file, then I'll end up with standalone OSDs
> instead of the desired rocksDB on SSD. Maybe this is still a bug, I
> didn't check.
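One way to test this by hand — a sketch only, with illustrative VG/LV names ("ceph-db-nvme", "db-osd7") and the ~4.5 GB sizing from the report above; not a verified procedure — is to skip `lvm batch` for the DB device, create the block.db LV yourself, and pass it to `lvm prepare` directly:

```shell
# Sketch, not verified: check what still lives on the NVMe first --
# 'lvm batch' filters out db-devices carrying leftover partitions or LVM metadata.
pvs /dev/sdd
lvs -o lv_name,vg_name,lv_size

# Create the block.db LV manually (vgcreate only if no VG exists yet on /dev/sdd).
vgcreate ceph-db-nvme /dev/sdd
lvcreate -n db-osd7 -L 4.5G ceph-db-nvme

# 'lvm prepare' accepts an existing vg/lv for --block.db, so the NVMe
# does not need to be empty the way 'lvm batch' requires.
cephadm ceph-volume lvm prepare --bluestore \
    --data /dev/sdc --block.db ceph-db-nvme/db-osd7
```

Under cephadm the resulting OSD may still need to be activated and picked up by the orchestrator afterwards; treat this as a starting point rather than a complete recipe.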
> My workaround is this spec file:
>
> ---snip---
> block_db_size: 4G
> data_devices:
>   size: "20G:"
>   rotational: 1
> db_devices:
>   size: "10G"
>   rotational: 0
> filter_logic: AND
> placement:
>   hosts:
>   - host4
>   - host3
>   - host1
>   - host2
> service_id: default
> service_type: osd
> ---snip---
>
> If you apply the new spec file, then destroy and zap the standalone
> OSD, I believe the orchestrator should redeploy it correctly; it did in
> my case. But as I said, this is just a small lab environment.
>
> Regards,
> Eugen
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx