Re: cephadm automatic sizing of WAL/DB on SSD

It seems the 17.2.4 release has fixed this.

ceph-volume: fix fast device alloc size on multiple device (pr#47293, Arthur Outhenin-Chalandre)


Bug #56031: batch compute a lower size than what it should be for blockdb
with multiple fast device - ceph-volume - Ceph
<https://tracker.ceph.com/issues/56031>
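
For anyone upgrading, a quick sanity check is to compare the DB volume size
actually allocated per OSD against what you expect. A minimal sketch in
Python (assuming a recent release where "ceph osd metadata" reports
bluefs_db_size; adjust the expected size to your own db disks and db_slots):

import json, subprocess

# Assumption: bluefs_db_size (bytes, as a string) is present in the OSD metadata.
meta = json.loads(subprocess.check_output(
    ["ceph", "osd", "metadata", "--format", "json"]))

expected = 1.44e12 / 12   # e.g. one 1.44 TB db disk split into 12 slots
for osd in meta:
    db_size = int(osd.get("bluefs_db_size", 0))
    if db_size and db_size < 0.9 * expected:
        print(f"osd.{osd['id']}: db is {db_size / 2**30:.0f} GiB, "
              f"expected roughly {expected / 2**30:.0f} GiB")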

Regards,
Anh Phan

On Fri, Sep 16, 2022 at 2:34 AM Christophe BAILLON <cb@xxxxxxx> wrote:

> Hi
>
> The problem is still present in version 17.2.3;
> thanks for the workaround trick.
>
> Regards
>
> ----- Original Message -----
> > From: "Anh Phan Tuan" <anhphan.net@xxxxxxxxx>
> > To: "Calhoun, Patrick" <phineas@xxxxxx>
> > Cc: "Arthur Outhenin-Chalandre" <arthur.outhenin-chalandre@xxxxxxx>, "ceph-users" <ceph-users@xxxxxxx>
> > Sent: Thursday, August 11, 2022 10:14:17
> > Subject: Re: cephadm automatic sizing of WAL/DB on SSD
>
> > Hi Patrick,
> >
> > I also hit this bug when deploying a new cluster around the time of the
> > 16.2.7 release.
> >
> > The bug is in the way ceph calculates the db size from the given db disks.
> >
> > Instead of: slot db size = size of db disk / number of slots per disk,
> > ceph calculated: slot db size = size of one db disk / total number of
> > slots needed (i.e. the number of OSDs being prepared at that time).
> >
> > In your case, with 2 db disks, that makes the db size only 50% of the
> > correct value. In my case, with 4 db disks per host, it makes the db size
> > only 25% of the correct value.
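> >
> > A rough sketch of the arithmetic (plain Python, not the actual ceph-volume
> > code), using the numbers from this thread: 24 HDD OSDs sharing 2 x 1.44 TB
> > db disks with 12 db_slots each:
> >
> > db_disk_size = 1.44e12        # one 1.44 TB db disk, in bytes
> > slots_per_disk = 12           # db_slots from the service spec
> > osds_prepared = 24            # all OSDs prepared in the same batch run
> >
> > intended = db_disk_size / slots_per_disk   # ~120 GB per DB volume
> > buggy = db_disk_size / osds_prepared       # ~60 GB, i.e. 50% of intended
> > print(intended / 1e9, buggy / 1e9)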
> >
> > This bug happens even when you deploy with the batch command. At the
> > time, I worked around it by still using the batch command, but deploying
> > only the OSDs belonging to one db disk at a time; in that case ceph
> > calculated the correct value.
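> >
> > That works because, with only one db disk's OSDs in the run, the buggy
> > division by the total number of OSDs coincides with the intended division
> > by slots per disk. A minimal sketch with the same example numbers:
> >
> > db_disk_size = 1.44e12        # one 1.44 TB db disk, in bytes
> > osds_in_run = 12              # only the 12 OSDs backed by this db disk
> > print(db_disk_size / osds_in_run / 1e9)    # ~120 GB, the intended size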
> >
> > Cheers,
> > Anh Phan
> >
> >
> >
> > On Sat, Jul 30, 2022 at 12:31 AM Calhoun, Patrick <phineas@xxxxxx> wrote:
> >
> >> Thanks, Arthur,
> >>
> >> I think you are right about that bug looking very similar to what I've
> >> observed. I'll try to remember to update the list once the fix is merged
> >> and released and I get a chance to test it.
> >>
> >> I'm hoping somebody can comment on ceph's current best practices for
> >> sizing WAL/DB volumes, considering rocksdb levels and compaction.
> >>
> >> -Patrick
> >>
> >> ________________________________
> >> From: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@xxxxxxx>
> >> Sent: Friday, July 29, 2022 2:11 AM
> >> To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
> >> Subject:  Re: cephadm automatic sizing of WAL/DB on SSD
> >>
> >> Hi Patrick,
> >>
> >> On 7/28/22 16:22, Calhoun, Patrick wrote:
> >> > In a new OSD node with 24 hdd (16 TB each) and 2 ssd (1.44 TB each), I'd
> >> > like to have "ceph orch" allocate WAL and DB on the ssd devices.
> >> >
> >> > I use the following service spec:
> >> > spec:
> >> >   data_devices:
> >> >     rotational: 1
> >> >     size: '14T:'
> >> >   db_devices:
> >> >     rotational: 0
> >> >     size: '1T:'
> >> >   db_slots: 12
> >> >
> >> > This results in each OSD having a 60GB volume for WAL/DB, which equates
> >> > to 50% total usage in the VG on each ssd, and 50% free.
> >> > I honestly don't know what size to expect, but exactly 50% of capacity
> >> > makes me suspect this is due to a bug:
> >> > https://tracker.ceph.com/issues/54541
> >> > (In fact, I had run into this bug when specifying block_db_size rather
> >> > than db_slots)
> >> >
> >> > Questions:
> >> >   Am I being bit by that bug?
> >> >   Is there a better approach, in general, to my situation?
> >> >   Are DB sizes still governed by the rocksdb tiering? (I thought that
> >> >   this was mostly resolved by https://github.com/ceph/ceph/pull/29687 )
> >> >   If I provision a DB/WAL logical volume sized at 61GB, is that
> >> >   effectively a 30GB database, and 30GB of extra room for compaction?
> >>
> >> I don't use cephadm, but it may be related to this regression:
> >> https://tracker.ceph.com/issues/56031. At least the symptoms look very
> >> similar...
> >>
> >> Cheers,
> >>
> >> --
> >> Arthur Outhenin-Chalandre
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> --
> Christophe BAILLON
> Mobile :: +336 16 400 522
> Work :: https://eyona.com
> Twitter :: https://twitter.com/ctof
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



