Hi Sebastien! (solution below)

This is weird, because we had previously tested the ceph-volume refactor and it looked OK.
Anyway, here is the inventory output: https://pastebin.com/ADFeuNZi
And the ceph-volume log is here: https://termbin.com/i8mk

I couldn't figure out why the device was rejected. I believe I'm using ceph-volume as intended:
https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/#idempotency-and-disk-replacements

Wait -- I solved my problem. The OSDs were originally created like this:

# ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f]

Now, in order to recreate the OSD on sdg, I had tried:

# ceph-volume lvm batch /dev/sdg --db-devices /dev/sdf --osd-ids 1

That doesn't work. But if I pass all the devices again, it works!

# ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f] --osd-ids 1
--> passed data devices: 24 physical, 0 LVM
--> relative data size: 1.0
--> passed block_db devices: 4 physical, 0 LVM

Total OSDs: 1

  Type            Path                          LV Size         % of device
----------------------------------------------------------------------------------------------------
OSD id            1
  data            /dev/sdg                      5.46 TB         100.00%
  block_db        /dev/sdf                      37.26 GB        16.67%
--> The above OSDs would be created if the operation continues
--> do you want to proceed? (yes/no)

There are several interesting behaviours here -- if I pass fewer db-devices or fewer HDDs, it doesn't work as expected. So, lesson learned: in recent Ceph releases one must call ceph-volume lvm batch with exactly the same device arguments that were used originally.
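In case it's useful to anyone else doing a disk replacement, the end-to-end sequence that worked here looks roughly like the sketch below. The device globs and the OSD id are obviously specific to this host, and the extra --report dry run is just a suggestion to sanity-check the plan before committing:

# rough sketch of the mixed-OSD replacement workflow (Nautilus 14.2.20)
systemctl stop ceph-osd@1                 # stop the failed OSD
ceph osd out 1                            # mark it out
ceph-volume lvm zap --osd-id=1 --destroy  # removes the data VG and the matching db LV

# dry run: should report exactly one OSD (the zapped id) to be created
ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f] --osd-ids 1 --report

# the real run (still prompts yes/no before doing anything)
ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f] --osd-ids 1

The key is the last two commands: pass the complete original data and db device lists plus --osd-ids, and batch's idempotency logic filters out the devices that are still in use, so only the zapped OSD gets recreated (Total OSDs: 1 above).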
Best Regards,

Dan

On Tue, Apr 27, 2021 at 2:16 PM Sebastien Han <shan@xxxxxxxxxx> wrote:
>
> Hi Dan,
>
> I believe either the ceph-volume logs or the "ceph-volume inventory
> /dev/sdf" command should give you the reason why the device was
> rejected.
> If not legit that's probably a bug...
>
> Thanks!
> –––––––––
> Sébastien Han
> Senior Principal Software Engineer, Storage Architect
>
> "Always give 100%. Unless you're giving blood."
>
> On Tue, Apr 27, 2021 at 2:02 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > In 14.2.20, when re-creating a mixed OSD after device replacement,
> > ceph-volume batch is no longer able to find any available space for a
> > block_db.
> >
> > Below I have shown a zap [1] which frees up the HDD and one LV on the
> > block-dbs VG. But then we try to recreate, and none of the block-dbs
> > are available [2], even though there is free space on the VG:
> >
> >   VG                                         #PV #LV #SN Attr   VSize    VFree
> >   ceph-8dfd7f83-b60c-485b-9517-12203301a914    1   5   0 wz--n- <223.57g 37.26g
> >
> > This bug looks similar: https://tracker.ceph.com/issues/49096
> >
> > Is there something wrong with my procedure? Or does someone have an
> > idea how to make this work again?
> >
> > Best Regards,
> >
> > Dan
> >
> >
> > [1]
> >
> > # systemctl stop ceph-osd@1
> > # ceph osd out 1
> > marked out osd.1.
> > # ceph-volume lvm zap --osd-id=1 --destroy
> > --> Zapping: /dev/ceph-15daeeaa-b6d9-46a6-b955-fd7197341334/osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61
> > --> Unmounting /var/lib/ceph/osd/ceph-1
> > Running command: /usr/bin/umount -v /var/lib/ceph/osd/ceph-1
> >  stderr: umount: /var/lib/ceph/osd/ceph-1 unmounted
> > Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-15daeeaa-b6d9-46a6-b955-fd7197341334/osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61 bs=1M count=10 conv=fsync
> >  stderr: 10+0 records in
> > 10+0 records out
> >  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0849557 s, 123 MB/s
> > --> Only 1 LV left in VG, will proceed to destroy volume group ceph-15daeeaa-b6d9-46a6-b955-fd7197341334
> > Running command: /usr/sbin/vgremove -v -f ceph-15daeeaa-b6d9-46a6-b955-fd7197341334
> >  stderr: Removing ceph--15daeeaa--b6d9--46a6--b955--fd7197341334-osd--block--ef46b9bf--c85f--49d8--9db3--9ed164f5cc61 (253:0)
> >  stderr: Archiving volume group "ceph-15daeeaa-b6d9-46a6-b955-fd7197341334" metadata (seqno 5).
> >  stderr: Releasing logical volume "osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61"
> >  stderr: Creating volume group backup "/etc/lvm/backup/ceph-15daeeaa-b6d9-46a6-b955-fd7197341334" (seqno 6).
> >  stdout: Logical volume "osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61" successfully removed
> >  stderr: Removing physical volume "/dev/sdg" from volume group "ceph-15daeeaa-b6d9-46a6-b955-fd7197341334"
> >  stdout: Volume group "ceph-15daeeaa-b6d9-46a6-b955-fd7197341334" successfully removed
> > --> Zapping: /dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> > Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0 bs=1M count=10 conv=fsync
> >  stderr: 10+0 records in
> > 10+0 records out
> >  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0310497 s, 338 MB/s
> > --> More than 1 LV left in VG, will proceed to destroy LV only
> > --> Removing LV because --destroy was given: /dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> > Running command: /usr/sbin/lvremove -v -f /dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> >  stdout: Logical volume "osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0" successfully removed
> >  stderr: Removing ceph--8dfd7f83--b60c--485b--9517--12203301a914-osd--db--d84267a8--057a--4b54--b1d4--7894e3eabec0 (253:1)
> >  stderr: Archiving volume group "ceph-8dfd7f83-b60c-485b-9517-12203301a914" metadata (seqno 25).
> >  stderr: Releasing logical volume "osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0"
> >  stderr: Creating volume group backup "/etc/lvm/backup/ceph-8dfd7f83-b60c-485b-9517-12203301a914" (seqno 26).
> > --> Zapping successful for OSD: 1
> > #
> >
> > [2]
> >
> > # ceph-volume lvm batch /dev/sdg --db-devices /dev/sdf --osd-ids 1
> > --> passed data devices: 1 physical, 0 LVM
> > --> relative data size: 1.0
> > --> passed block_db devices: 1 physical, 0 LVM
> > --> 1 fast devices were passed, but none are available
> >
> > Total OSDs: 0
> >
> >   Type            Path                          LV Size         % of device
> > --> The above OSDs would be created if the operation continues
> > --> do you want to proceed? (yes/no) no
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx