Re: ceph-volume batch does not find available block_db

Hi Sebastien!

(solution below)

This is weird, because we had previously tested the ceph-volume
refactor and it looked ok.
Anyway, here is the inventory output: https://pastebin.com/ADFeuNZi
And the ceph-volume log is here: https://termbin.com/i8mk
I couldn't work out from those why the device was rejected.
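In case it helps anyone else debugging this, what I had to go on was
roughly the following (the VG name is the block-db VG on this node;
adjust for your own layout):

# ceph-volume inventory /dev/sdf                     # should say whether the device is considered available
# vgs ceph-8dfd7f83-b60c-485b-9517-12203301a914      # free space left on the block-db VG
# lvs ceph-8dfd7f83-b60c-485b-9517-12203301a914      # which osd-db LVs still exist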

I believe I'm using ceph-volume as intended...
https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/#idempotency-and-disk-replacements


Wait -- I solved my problem.
The OSDs were originally created like this:
ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f]

Now, in order to recreate the OSD on sdg, I had tried:
ceph-volume lvm batch /dev/sdg --db-devices /dev/sdf --osd-ids 1
That doesn't work.

But if I use all the devices again, it works!

# ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f] --osd-ids 1
--> passed data devices: 24 physical, 0 LVM
--> relative data size: 1.0
--> passed block_db devices: 4 physical, 0 LVM

Total OSDs: 1

  Type            Path                                                    LV Size         % of device
----------------------------------------------------------------------------------------------------
  OSD id          1
  data            /dev/sdg                                                5.46 TB         100.00%
  block_db        /dev/sdf                                                37.26 GB        16.67%
--> The above OSDs would be created if the operation continues
--> do you want to proceed? (yes/no)
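For completeness, after answering yes, something like this should
confirm the result (output will of course differ per node):

# ceph-volume lvm list /dev/sdg      # the recreated OSD, with its block and block.db LVs
# ceph osd tree                      # osd.1 should show up again once it has booted
# ceph osd in 1                      # only if it was marked out manually and doesn't come back in on its own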

There are several interesting behaviours here -- if I pass fewer
db-devices or fewer HDDs, it doesn't work as expected.

So, lesson learned: in recent Ceph releases, ceph-volume batch must be
invoked with exactly the same set of devices that was used when the
OSDs were originally created.
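For the record, the full replacement recipe that worked here boils down
to the following (device paths and the OSD id are specific to this box):

# systemctl stop ceph-osd@1
# ceph osd out 1
# ceph-volume lvm zap --osd-id=1 --destroy
# ceph-volume lvm batch /dev/sd[g-z] /dev/sda[a-d] --db-devices /dev/sd[c-f] --osd-ids 1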

Best Regards,
Dan




On Tue, Apr 27, 2021 at 2:16 PM Sebastien Han <shan@xxxxxxxxxx> wrote:
>
> Hi Dan,
>
> I believe either the ceph-volume logs or the "ceph-volume inventory
> /dev/sdf" command should give you the reason why the device was
> rejected.
> If not legit that's probably a bug...
>
> Thanks!
> –––––––––
> Sébastien Han
> Senior Principal Software Engineer, Storage Architect
>
> "Always give 100%. Unless you're giving blood."
>
> On Tue, Apr 27, 2021 at 2:02 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >
> > Hi all,
> >
> > In 14.2.20, when re-creating a mixed OSD after device replacement,
> > ceph-volume batch is no longer able to find any available space for a
> > block_db.
> >
> > Below I have shown a zap [1] which frees up the HDD and one LV on the
> > block-dbs VG.
> > But then we try to recreate, and none of the block-dbs are available
> > [2], even though there is free space on the VG:
> >
> >   VG                                        #PV #LV #SN Attr   VSize    VFree
> >   ceph-8dfd7f83-b60c-485b-9517-12203301a914   1   5   0 wz--n- <223.57g 37.26g
> >
> > This bug looks similar: https://tracker.ceph.com/issues/49096
> >
> > Is there something wrong with my procedure? Or does someone have an
> > idea how to make this work again?
> >
> > Best Regards,
> >
> > Dan
> >
> >
> > [1]
> >
> > # systemctl stop ceph-osd@1
> > # ceph osd out 1
> > marked out osd.1.
> > # ceph-volume lvm zap --osd-id=1 --destroy
> > --> Zapping: /dev/ceph-15daeeaa-b6d9-46a6-b955-fd7197341334/osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61
> > --> Unmounting /var/lib/ceph/osd/ceph-1
> > Running command: /usr/bin/umount -v /var/lib/ceph/osd/ceph-1
> >  stderr: umount: /var/lib/ceph/osd/ceph-1 unmounted
> > Running command: /usr/bin/dd if=/dev/zero
> > of=/dev/ceph-15daeeaa-b6d9-46a6-b955-fd7197341334/osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61
> > bs=1M count=10 conv=fsync
> >  stderr: 10+0 records in
> > 10+0 records out
> >  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0849557 s, 123 MB/s
> > --> Only 1 LV left in VG, will proceed to destroy volume group
> > ceph-15daeeaa-b6d9-46a6-b955-fd7197341334
> > Running command: /usr/sbin/vgremove -v -f
> > ceph-15daeeaa-b6d9-46a6-b955-fd7197341334
> >  stderr: Removing
> > ceph--15daeeaa--b6d9--46a6--b955--fd7197341334-osd--block--ef46b9bf--c85f--49d8--9db3--9ed164f5cc61
> > (253:0)
> >  stderr: Archiving volume group
> > "ceph-15daeeaa-b6d9-46a6-b955-fd7197341334" metadata (seqno 5).
> >  stderr: Releasing logical volume
> > "osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61"
> >  stderr: Creating volume group backup
> > "/etc/lvm/backup/ceph-15daeeaa-b6d9-46a6-b955-fd7197341334" (seqno 6).
> >  stdout: Logical volume
> > "osd-block-ef46b9bf-c85f-49d8-9db3-9ed164f5cc61" successfully removed
> >  stderr: Removing physical volume "/dev/sdg" from volume group
> > "ceph-15daeeaa-b6d9-46a6-b955-fd7197341334"
> >  stdout: Volume group "ceph-15daeeaa-b6d9-46a6-b955-fd7197341334"
> > successfully removed
> > --> Zapping: /dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> > Running command: /usr/bin/dd if=/dev/zero
> > of=/dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> > bs=1M count=10 conv=fsync
> >  stderr: 10+0 records in
> > 10+0 records out
> >  stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0310497 s, 338 MB/s
> > --> More than 1 LV left in VG, will proceed to destroy LV only
> > --> Removing LV because --destroy was given:
> > /dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> > Running command: /usr/sbin/lvremove -v -f
> > /dev/ceph-8dfd7f83-b60c-485b-9517-12203301a914/osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0
> >  stdout: Logical volume "osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0"
> > successfully removed
> >  stderr: Removing
> > ceph--8dfd7f83--b60c--485b--9517--12203301a914-osd--db--d84267a8--057a--4b54--b1d4--7894e3eabec0
> > (253:1)
> >  stderr: Archiving volume group
> > "ceph-8dfd7f83-b60c-485b-9517-12203301a914" metadata (seqno 25).
> >  stderr: Releasing logical volume "osd-db-d84267a8-057a-4b54-b1d4-7894e3eabec0"
> >  stderr: Creating volume group backup
> > "/etc/lvm/backup/ceph-8dfd7f83-b60c-485b-9517-12203301a914" (seqno
> > 26).
> > --> Zapping successful for OSD: 1
> > #
> >
> > [2]
> >
> > # ceph-volume lvm batch /dev/sdg --db-devices /dev/sdf --osd-ids 1
> > --> passed data devices: 1 physical, 0 LVM
> > --> relative data size: 1.0
> > --> passed block_db devices: 1 physical, 0 LVM
> > --> 1 fast devices were passed, but none are available
> >
> > Total OSDs: 0
> >
> >   Type            Path                                                    LV Size         % of device
> > --> The above OSDs would be created if the operation continues
> > --> do you want to proceed? (yes/no) no
> >
>