Re: Move block.db to new ssd

Hello Frédéric,

The advice regarding 30/300 GB DB sizes is no longer valid. Since Ceph
15.2.8, thanks to the new default (bluestore_volume_selection_policy =
use_some_extra), BlueStore no longer wastes the extra capacity of the DB
device.
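
A quick way to verify which policy an OSD is actually running with (osd.4
below is just an example ID):

  ceph daemon osd.4 config get bluestore_volume_selection_policy

or, via the MONs:

  ceph config get osd.4 bluestore_volume_selection_policy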

On Tue, Nov 12, 2024 at 5:52 PM Frédéric Nass
<frederic.nass@xxxxxxxxxxxxxxxx> wrote:
>
>
>
> ----- On 12 Nov 24, at 8:51, Roland Giesler roland@xxxxxxxxxxxxxx wrote:
>
> > On 2024/11/12 04:54, Alwin Antreich wrote:
> >> Hi Roland,
> >>
> >> On Mon, Nov 11, 2024, 20:16 Roland Giesler <roland@xxxxxxxxxxxxxx> wrote:
> >>
> >>> I have Ceph 17.2.6 on a Proxmox cluster and want to replace some SSDs
> >>> that are end of life.  I have some spinners that have their journals on
> >>> SSD.  Each spinner has a 50GB SSD LVM partition, and I want to move each
> >>> of those to a new corresponding partition.
> >>>
> >>> The new 4TB SSDs I have split into volumes with:
> >>>
> >>> # lvcreate -n NodeA-nvme-LV-RocksDB1 -L 47.69g NodeA-nvme0
> >>> # lvcreate -n NodeA-nvme-LV-RocksDB2 -L 47.69g NodeA-nvme0
> >>> # lvcreate -n NodeA-nvme-LV-RocksDB3 -L 47.69g NodeA-nvme0
> >>> # lvcreate -n NodeA-nvme-LV-RocksDB4 -L 47.69g NodeA-nvme0
> >>> # lvcreate -n NodeA-nvme-LV-data -l 100%FREE NodeA-nvme1
> >>> # lvcreate -n NodeA-nvme-LV-data -l 100%FREE NodeA-nvme0
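> >>>
> >>> To double-check the resulting layout (just a sanity check with standard
> >>> LVM tooling), this lists the new volumes and their sizes:
> >>>
> >>> # lvs -o lv_name,lv_size,vg_name NodeA-nvme0 NodeA-nvme1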
> >>>
> >> I'd caution against mixing DB/WAL partitions with other workloads on the
> >> same device. The performance profile may not be suited for shared use.
> >> And depending on the use case, ~48GB might not be big enough to prevent
> >> DB spillover. Check the current size by querying the OSD.
> >
> > I see a relatively small RocksDB and no WAL?
> >
> > ceph daemon osd.4 perf dump
> > <snip>
> >     "bluefs": {
> >         "db_total_bytes": 45025845248,
> >         "db_used_bytes": 2131755008,
> >         "wal_total_bytes": 0,
> >         "wal_used_bytes": 0,
> > </snip>
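> >
> > (That is about 2.1GB used out of the ~45GB DB volume, i.e. roughly 5% of
> > the partition in use.)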
> >
> > I have been led to understand that 4% is the high end, and that it is only
> > reached, if ever, on very busy systems?
>
> Hi Roland,
>
> This is generally true but it depends on what your cluster is used for.
>
> If your cluster is used for block (RBD) storage, then 1%-2% should be enough. If your cluster is used for file (CephFS) or S3 (RGW) storage, then you'd rather stay on the safe side and respect the 4% recommendation, as these workloads make heavy use of block.db to store metadata.
>
> Now, percentage is one thing; level size is another. To avoid spillover when block.db usage approaches 30GB, you'd better choose a block.db size of 300GB+, whatever percentage of the block size that represents, unless you want to play with RocksDB level sizes and multipliers, which you probably don't.
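>
> For context, this is roughly where the 30GB/300GB figures come from: with
> the RocksDB defaults (max_bytes_for_level_base around 256MB and a level
> multiplier of 10), the levels work out to approximately L1=256MB, L2=2.5GB,
> L3=25GB, L4=250GB. Under the old strict-level placement policy, the DB
> volume had to fit a whole level to make use of it, so any size between
> roughly 30GB and 300GB gave you no more usable DB space than 30GB.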
>
> Regards,
> Frédéric.
>
> [1] https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#sizing
> [2] https://www.ibm.com/docs/en/storage-ceph/7.1?topic=bluestore-sizing-considerations
> [3] https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide
>
> >
> >>> What am I missing to get these changes to be permanent?
> >>>
> >> Likely just an issue with the order of execution. But there is an easier
> >> way to do the move. See:
> >> https://docs.ceph.com/en/quincy/ceph-volume/lvm/migrate/
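> >>
> >> As a sketch (assuming osd.4 and the LV names from your lvcreate commands;
> >> the OSD has to be stopped before migrating):
> >>
> >> # systemctl stop ceph-osd@4
> >> # ceph-volume lvm migrate --osd-id 4 --osd-fsid <osd-fsid> \
> >>       --from db --target NodeA-nvme0/NodeA-nvme-LV-RocksDB1
> >> # systemctl start ceph-osd@4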
> >
> > Ah, excellent!  I didn't find that in my searches.  Will try that now.
> >
> > regards
> >
> > Roland
> >
> >
> >>
> >> Cheers,
> >> Alwin
> >>
> >> --
> >>
> >> Alwin Antreich
> >> Head of Training and Proxmox Services
> >>
> >> croit GmbH, Freseniusstr. 31h, 81247 Munich
> >> CEO: Martin Verges, Andy Muthmann - VAT-ID: DE310638492
> >> Com. register: Amtsgericht Munich HRB 231263
> >> Web: https://croit.io/



-- 
Alexander Patrakov
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



