What does your OSD service specification look like? Did your db/wal device show as having free space prior to the OSD creation?

On Tue, Jan 31, 2023, at 04:01, mailing-lists wrote:
> OK, the OSD is filled again. It is in and up, but it is not using the NVMe WAL/DB anymore.
>
> And it looks like the LVM group of the old OSD is still on the NVMe drive. I suspect this because the two NVMe drives still have 9 LVM groups each: 18 groups, but only 17 OSDs are using the NVMe (shown in the dashboard).
>
> Do you have a hint on how to fix this?
>
> Best
>
> Ken
>
> On 30.01.23 16:50, mailing-lists wrote:
>> Oh wait,
>>
>> I might have been too impatient:
>>
>> 1/30/23 4:43:07 PM [INF] Deploying daemon osd.232 on ceph-a1-06
>> 1/30/23 4:42:26 PM [INF] Found osd claims for drivegroup dashboard-admin-1661788934732 -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:42:26 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:42:19 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:41:01 PM [INF] Found osd claims for drivegroup dashboard-admin-1661788934732 -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:41:01 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:41:01 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:41:00 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:39:34 PM [INF] Found osd claims for drivegroup dashboard-admin-1661788934732 -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:39:34 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>> 1/30/23 4:39:34 PM [INF] Found osd claims -> {'ceph-a1-06': ['232']}
>>
>> Although it doesn't show the NVMe as WAL/DB yet, I will let it settle into a clean state before I do anything further.
>>
>> On 30.01.23 16:42, mailing-lists wrote:
>>> root@ceph-a2-01:/# ceph osd destroy 232 --yes-i-really-mean-it
>>> destroyed osd.232
>>>
>>> OSD 232 now shows as destroyed and out in the dashboard.
>>>
>>> root@ceph-a1-06:/# ceph-volume lvm zap /dev/sdm
>>> --> Zapping: /dev/sdm
>>> --> --destroy was not specified, but zapping a whole device will remove the partition table
>>> Running command: /usr/bin/dd if=/dev/zero of=/dev/sdm bs=1M count=10 conv=fsync
>>>  stderr: 10+0 records in
>>> 10+0 records out
>>>  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0675647 s, 155 MB/s
>>> --> Zapping successful for: <Raw Device: /dev/sdm>
>>>
>>> root@ceph-a2-01:/# ceph orch device ls
>>> ceph-a1-06  /dev/sdm  hdd  TOSHIBA_X_X  16.0T  21m ago  *locked*
>>>
>>> It shows as locked and is not being added automatically now, which is good, I think? Otherwise it would probably become a new OSD 307.
>>>
>>> root@ceph-a2-01:/# ceph orch osd rm status
>>> No OSD remove/replace operations reported
>>>
>>> root@ceph-a2-01:/# ceph orch osd rm 232 --replace
>>> Unable to find OSDs: ['232']
>>>
>>> Unfortunately it is still not replacing.
>>>
>>> It is so weird; I tried exactly this procedure in my virtual Ceph environment and it just worked. The real scenario is acting up now. -.-
>>>
>>> Do you have more hints for me?
>>>
>>> Thank you for your help so far!
>>>
>>> Best
>>>
>>> Ken
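For reference, the orchestrator-driven flow this thread is trying to follow looks roughly like the sketch below. It is an illustration of the usual sequence, not a transcript from this cluster, and the --zap flag on ceph orch osd rm may not be available on older releases:

  # 1. Schedule the removal while preserving the OSD id; --zap also cleans the
  #    backing devices (including any db/wal LV) once the OSD has drained:
  ceph orch osd rm 232 --replace --zap

  # 2. Watch the drain/removal progress; the entry disappears when it is done:
  ceph orch osd rm status

  # 3. After the drain, the OSD should be listed as "destroyed" rather than
  #    removed, which is what allows the id to be reused:
  ceph osd tree | grep -w destroyed

  # 4. Once the replacement disk is physically in place and clean, the existing
  #    OSD service spec should pick it up and redeploy osd.232 automatically.

Note that the last step assumes the OSD service spec that originally created osd.232 is still applied and is not set to unmanaged.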
>>> On 30.01.23 15:46, David Orman wrote:
>>>> The 'down' status is why it's not being replaced, vs. destroyed, which would allow the replacement. I'm not sure why --replace led to that scenario, but you will probably need to mark it destroyed for it to be replaced.
>>>>
>>>> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/#replacing-an-osd has instructions on the non-orch way of doing that. You only need 1/2.
>>>>
>>>> You should look through your logs to see why the OSD was marked down and not destroyed. Obviously, make sure you understand the ramifications before running any commands. :)
>>>>
>>>> David
>>>>
>>>> On Mon, Jan 30, 2023, at 04:24, mailing-lists wrote:
>>>>> # ceph orch osd rm status
>>>>> No OSD remove/replace operations reported
>>>>>
>>>>> # ceph orch osd rm 232 --replace
>>>>> Unable to find OSDs: ['232']
>>>>>
>>>>> It is not finding 232 anymore. It is still shown as down and out in the Ceph dashboard.
>>>>>
>>>>> pgs: 3236 active+clean
>>>>>
>>>>> This is the new disk, shown as locked (because it is unzapped at the moment):
>>>>>
>>>>> # ceph orch device ls
>>>>> ceph-a1-06  /dev/sdm  hdd  TOSHIBA_X_X  16.0T  9m ago  locked
>>>>>
>>>>> Best
>>>>>
>>>>> Ken
>>>>>
>>>>> On 29.01.23 18:19, David Orman wrote:
>>>>>> What does "ceph orch osd rm status" show before you try the zap? Is your cluster still backfilling to the other OSDs for the PGs that were on the failed disk?
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On Fri, Jan 27, 2023, at 03:25, mailing-lists wrote:
>>>>>>> Dear Ceph-Users,
>>>>>>>
>>>>>>> I am struggling to replace a disk. My Ceph cluster is not replacing the old OSD even though I ran:
>>>>>>>
>>>>>>> ceph orch osd rm 232 --replace
>>>>>>>
>>>>>>> OSD 232 is still shown in the OSD list, but the new HDD gets placed as a new OSD. This wouldn't bother me much if that OSD were also placed on the BlueStore DB / NVMe, but it isn't.
>>>>>>>
>>>>>>> My steps:
>>>>>>>
>>>>>>> "ceph orch osd rm 232 --replace"
>>>>>>> Remove the failed HDD.
>>>>>>> Add the new one.
>>>>>>> Convert the disk within the server's BIOS so that the node has direct access to it.
>>>>>>> It shows up as /dev/sdt.
>>>>>>> Enter maintenance mode.
>>>>>>> Reboot the server.
>>>>>>> The drive is now /dev/sdm (which the old drive had).
>>>>>>> "ceph orch device zap node-x /dev/sdm"
>>>>>>> A new OSD is placed on the cluster.
>>>>>>>
>>>>>>> Can you give me a hint where I took a wrong turn? Why is the disk not being used as OSD 232?
>>>>>>>
>>>>>>> Best
>>>>>>>
>>>>>>> Ken
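On the question at the top of the thread: whether a new OSD gets its DB/WAL carved out of the NVMe is driven by the OSD service specification (the drive group named dashboard-admin-1661788934732 in the orchestrator log near the top of the thread). The sketch below shows how such a spec can be inspected and what a minimal HDD-plus-NVMe layout might look like; the filter values and host pattern are placeholders, not the poster's actual configuration:

  # Dump the spec(s) the orchestrator currently has applied:
  ceph orch ls osd --export

  # Minimal example of a spec that puts data on rotational drives and DB/WAL on
  # non-rotational (NVMe) drives -- illustrative values only:
  cat > osd-spec.yml <<'EOF'
  service_type: osd
  service_id: dashboard-admin-1661788934732
  placement:
    host_pattern: 'ceph-a1-*'
  spec:
    data_devices:
      rotational: 1
    db_devices:
      rotational: 0
  EOF

  # Preview what would be deployed before applying anything:
  ceph orch apply -i osd-spec.yml --dry-run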
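The "18 LVM groups but only 17 OSDs" observation in the newest message suggests the old OSD's DB volume was never removed from the NVMe, which would also leave the device without free space for a new DB. A rough way to confirm and clean that up is sketched below, assuming a reasonably recent ceph-volume; the zap with --osd-id/--destroy is destructive, so double-check which volumes it matches first:

  # On the OSD host: list the LVs ceph-volume knows about and which OSD each
  # belongs to; look for a db LV still tagged with the old osd.232:
  ceph-volume lvm list

  # Cross-check the raw LVM state on the NVMe devices:
  lvs -o lv_name,vg_name,lv_size,lv_tags

  # Check whether the orchestrator still sees usable space on the NVMe:
  ceph orch device ls --wide

  # If an orphaned db LV for the old OSD is confirmed, zap it so the space
  # becomes available again (destructive -- be sure it is the right one):
  ceph-volume lvm zap --osd-id 232 --destroy

Once the stale DB LV is gone, the NVMe should show available space again, and a redeploy driven by the spec above can place the new OSD's DB on it.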
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx