Hi David, no problem, thanks for your help!

Went through your commands, here are the results:

- 4 servers with OSDs
- Server "nubceph04" has 2 OSDs (osd.0 and osd.7 on /dev/sdb and /dev/sdc respectively, with the db_device on /dev/sdd)

# capture "db device" and raw device associated with OSD (just for safety)
"ceph-volume lvm list" shows, for each OSD, which disks and LVs are in use (snipped):

====== osd.0 =======

  [block]  /dev/ceph-block-b301ec31-5779-4834-9fb7-e45afa45f803/osd-block-79d89e54-4a4b-4e89-aea3-72fa6aa343a5
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      osd id        0
      devices       /dev/sdb

  [db]  /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      block device  /dev/ceph-block-b301ec31-5779-4834-9fb7-e45afa45f803/osd-block-79d89e54-4a4b-4e89-aea3-72fa6aa343a5
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      osd id        0
      devices       /dev/sdd

====== osd.7 =======

  [block]  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      block device  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      osd id        7
      devices       /dev/sdc

  [db]  /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3
      block device  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3
      osd id        7
      devices       /dev/sdd

# drain drive if possible, do this when planning replacement, otherwise do once failure has occurred
# Try to remove osd.7
ceph orch osd rm 7 --replace
Scheduled OSD(s) for removal

Waited until it finished rebalancing, monitoring with:
ceph -W cephadm
2021-08-30T18:05:32.280716-0300 mgr.nubvm02.viqmmr [INF] OSD <7> is not empty yet. Waiting a bit more
2021-08-30T18:06:03.374424-0300 mgr.nubvm02.viqmmr [INF] OSDs <[<OSD>(osd_id=7, is_draining=False)]> are now <down>

# Once drained (or if failure occurred) (we don't use the orch version yet because we've had issues with it)
ceph-volume lvm zap --osd-id 7 --destroy
--> Zapping: /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3 bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0853454 s, 123 MB/s
--> More than 1 LV left in VG, will proceed to destroy LV only
--> Removing LV because --destroy was given: /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3
Running command: /usr/sbin/lvremove -v -f /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3
 stdout: Logical volume "osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3" successfully removed
 stderr: Removing ceph--block--dbs--08ee3a44--8503--40dd--9bdd--ed9a8f674a54-osd--block--db--fd2bd125--3f22--40f1--8524--744a100236f3 (253:3)
 stderr: Archiving volume group "ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54" metadata (seqno 9).
 stderr: Releasing logical volume "osd-block-db-fd2bd125-3f22-40f1-8524-744a100236f3"
 stderr: Creating volume group backup "/etc/lvm/backup/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54" (seqno 10).
--> Zapping: /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad bs=1M count=10 conv=fsync
 stderr: 10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.054587 s, 192 MB/s
--> Only 1 LV left in VG, will proceed to destroy volume group ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3
Running command: /usr/sbin/vgremove -v -f ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3
 stderr: Removing ceph--block--c3d30e81--ff7d--4007--9ad4--c16f852466a3-osd--block--42278e28--5274--4167--a014--6a6a956110ad (253:2)
 stderr: Archiving volume group "ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3" metadata (seqno 5).
         Releasing logical volume "osd-block-42278e28-5274-4167-a014-6a6a956110ad"
 stderr: Creating volume group backup "/etc/lvm/backup/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3" (seqno 6).
 stdout: Logical volume "osd-block-42278e28-5274-4167-a014-6a6a956110ad" successfully removed
 stderr: Removing physical volume "/dev/sdc" from volume group "ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3"
 stdout: Volume group "ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3" successfully removed
--> Zapping successful for OSD: 7

After that, the command:
ceph-volume lvm list
shows the same as above for osd.0, but nothing about osd.7.

# refresh devices
ceph orch device ls --refresh
HOST       PATH      TYPE  SIZE   DEVICE_ID  MODEL         VENDOR  ROTATIONAL  AVAIL  REJECT REASONS
nubceph04  /dev/sda  hdd   19.0G             Virtual disk  VMware  1           False  locked
nubceph04  /dev/sdb  hdd   20.0G             Virtual disk  VMware  1           False  locked, Insufficient space (<5GB) on vgs, LVM detected
nubceph04  /dev/sdc  hdd   20.0G             Virtual disk  VMware  1           True
nubceph04  /dev/sdd  hdd   10.0G             Virtual disk  VMware  1           False  locked, LVM detected

After some time, it recreates osd.7, but without the db_device.

# monitor ceph for replacement
ceph -W cephadm
..
2021-08-30T18:11:22.439190-0300 mgr.nubvm02.viqmmr [INF] Deploying daemon osd.7 on nubceph04
..

Waited until it finished rebalancing.
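(A possible cross-check at this point, just as a sketch since the exact metadata field names may vary between releases: ask the OSD itself whether it picked up a dedicated DB, and compare that with the LVM view on the host.)

# does the redeployed OSD report a dedicated DB device?
ceph osd metadata 7 | grep -i bluefs_db
# which DB LVs exist on the host, and on which physical devices?
lvs -o lv_name,vg_name,lv_size,devices | grep ceph-block-dbs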
If I run again:
ceph-volume lvm list
it shows, for each OSD, which disks and LVs are in use (snipped):

====== osd.0 =======

  [block]  /dev/ceph-block-b301ec31-5779-4834-9fb7-e45afa45f803/osd-block-79d89e54-4a4b-4e89-aea3-72fa6aa343a5
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      osd id        0
      devices       /dev/sdb

  [db]  /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      block device  /dev/ceph-block-b301ec31-5779-4834-9fb7-e45afa45f803/osd-block-79d89e54-4a4b-4e89-aea3-72fa6aa343a5
      db device     /dev/ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-block-db-e7771b96-7a1d-43b2-a7d8-9204ef158224
      osd id        0
      devices       /dev/sdd

====== osd.7 =======

  [block]  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      block device  /dev/ceph-block-c3d30e81-ff7d-4007-9ad4-c16f852466a3/osd-block-42278e28-5274-4167-a014-6a6a956110ad
      osd id        7
      devices       /dev/sdc

It seems it didn't create the LV in "ceph-block-dbs" as it had before.

If I run everything again but with osd.0, it gets re-created correctly, because when running:
ceph-volume lvm zap --osd-id 0 --destroy
it doesn't print this line:
--> More than 1 LV left in VG, will proceed to destroy LV only
but rather this one:
--> Only 1 LV left in VG, will proceed to destroy volume group

As far as I can tell, if the disk is not empty, it just doesn't get used.

Let me know if I wasn't clear enough.

Best regards,
Eric

________________________________
From: David Orman <ormandj@xxxxxxxxxxxx>
Sent: Monday, August 30, 2021 1:14 PM
To: Eric Fahnle <efahnle@xxxxxxxxxxx>
Cc: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Re: Missing OSD in SSD after disk failure

I may have misread your original email, for which I apologize. If you do a
'ceph orch device ls', does the NVMe in question show as available? On the
host with the failed OSD, if you run lvs/lsblk, do you still see the old DB
on the NVMe? I'm not sure the replacement process you followed will work.

Here's what we do for OSD pre-failure as well as failures on nodes with NVMe
backing the OSD for DB/WAL. In a cephadm shell, on the host with the drive to
replace (in this example, let's say osd.391 on a node called ceph15):

# capture "db device" and raw device associated with OSD (just for safety)
ceph-volume lvm list | less

# drain drive if possible, do this when planning replacement, otherwise do once failure has occurred
ceph orch osd rm 391 --replace

# Once drained (or if failure occurred) (we don't use the orch version yet because we've had issues with it)
ceph-volume lvm zap --osd-id 391 --destroy

# refresh devices
ceph orch device ls --refresh

# monitor ceph for replacement
ceph -W cephadm

# once the daemon has been deployed ("2021-03-25T18:03:16.742483+0000 mgr.ceph02.duoetc [INF] Deploying daemon osd.391 on ceph15"), watch for the rebalance to complete
ceph -s

# consider increasing max_backfills if it's just a single drive replacement:
ceph config set osd osd_max_backfills 10

# if you do, after backfilling is complete (validate with 'ceph -s'):
ceph config rm osd osd_max_backfills

The lvm zap cleans up the db/wal LV, which allows the replacement drive to
rebuild with db/wal on the NVMe.

Hope this helps,
David

On Fri, Aug 27, 2021 at 7:21 PM Eric Fahnle <efahnle@xxxxxxxxxxx> wrote:
>
> Hi David! Very much appreciated your response.
>
> I'm not sure that's the problem. I tried with the following spec (without using "rotational"):
>
> ...(snip)...
> data_devices:
>   size: "15G:"
> db_devices:
>   size: ":15G"
> filter_logic: AND
> placement:
>   label: "osdj2"
> service_id: test_db_device
> service_type: osd
> ...(snip)...
>
> Without success. I also tried without the "filter_logic: AND" line in the YAML file and the result was the same.
>
> Best regards,
> Eric
>
>
> -----Original Message-----
> From: David Orman [mailto:ormandj@xxxxxxxxxxxx]
> Sent: 27 August 2021 14:56
> To: Eric Fahnle
> Cc: ceph-users@xxxxxxx
> Subject: Re: Missing OSD in SSD after disk failure
>
> This was a bug in some versions of Ceph, which has since been fixed:
>
> https://tracker.ceph.com/issues/49014
> https://github.com/ceph/ceph/pull/39083
>
> You'll want to upgrade Ceph to resolve this behavior, or you can use size or something else to filter if that is not possible.
>
> David
>
> On Thu, Aug 19, 2021 at 9:12 AM Eric Fahnle <efahnle@xxxxxxxxxxx> wrote:
> >
> > Hi everyone!
> > I've got a question; I searched this list but didn't find an answer.
> >
> > I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD. The deployment was done with "ceph orch apply deploy-osd.yaml", where the file "deploy-osd.yaml" contained the following:
> > ---
> > service_type: osd
> > service_id: default_drive_group
> > placement:
> >   label: "osd"
> > data_devices:
> >   rotational: 1
> > db_devices:
> >   rotational: 0
> >
> > After the deployment, each HDD had an OSD and the NVMe was shared by the 4 OSDs for their DBs.
> >
> > A few days ago, an HDD broke and got replaced. Ceph detected the new disk and created a new OSD for the HDD, but didn't use the NVMe. The NVMe in that server now backs 3 OSDs, but the new one wasn't added to it. I couldn't find out how to re-create the OSD with the exact configuration it had before. The only "way" I found was to delete all 4 OSDs and create everything from scratch (I didn't actually do it, as I hope there is a better way).
> >
> > Has anyone had this issue before? I'd be glad if someone pointed me in the right direction.
> >
> > Currently running:
> > Version 15.2.8 octopus (stable)
> >
> > Thank you in advance and best regards,
> > Eric
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
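For anyone landing on this thread later: if upgrading past the tracker issue above isn't immediately possible, one manual path is to carve the DB LV by hand and point ceph-volume at it explicitly. The sketch below is only illustrative and untested in this thread: the VG name and data device are taken from the outputs earlier, the LV name "osd-db-sdc" is made up, the DB-less osd.7 would need to be removed and zapped again first, and the commands assume a cephadm shell on nubceph04 with the bootstrap-osd keyring in place.

# check how much free space the shared DB volume group still has
vgs ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54

# create a new DB LV using the space freed by the zap (adjust size/extents to match your layout)
lvcreate -l 100%FREE -n osd-db-sdc ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54

# prepare the OSD with an explicit block.db, then activate it
ceph-volume lvm prepare --bluestore --data /dev/sdc \
    --block.db ceph-block-dbs-08ee3a44-8503-40dd-9bdd-ed9a8f674a54/osd-db-sdc
ceph-volume lvm activate --all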