Re: cephadm to setup wal/db on nvme

I have replaced the Samsung with an Intel P4600 6.4TB NVMe (I created 3 OSDs
on top of the NVMe).

Here is the result:

(venv-openstack) root@os-ctrl1:~# rados -p test-nvme -t 64 -b 4096 bench 10 write
hints = 1
Maintaining 64 concurrent writes of 4096 bytes to objects of size 4096
for up to 10 seconds or 0 objects
Object prefix: benchmark_data_os-ctrl1_1030914
  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg lat(s)
    0       0         0         0         0         0           -           0
    1      63     31188     31125    121.56   121.582 0.000996695  0.00205185
    2      63     67419     67356   131.529   141.527  0.00158563  0.00189714
    3      63    101483    101420   132.033   133.062  0.00311369  0.00189039
    4      64    135147    135083   131.893   131.496  0.00132065  0.00189281
    5      63    169856    169793   132.628   135.586  0.00163604   0.0018825
    6      64    204437    204373   133.032   135.078 0.000880165  0.00187612
    7      63    239369    239306   133.518   136.457  0.00215911  0.00187017
    8      64    274318    274254    133.89   136.516  0.00130235  0.00186506
    9      63    309388    309325   134.233   136.996  0.00134813  0.00186031
   10       1    343849    343848   134.293   134.855  0.00205662  0.00185956
Total time run:         10.0018
Total writes made:      343849
Write size:             4096
Object size:            4096
Bandwidth (MB/sec):     134.292
Stddev Bandwidth:       5.1937
Max bandwidth (MB/sec): 141.527
Min bandwidth (MB/sec): 121.582
Average IOPS:           34378
Stddev IOPS:            1329.59
Max IOPS:               36231
Min IOPS:               31125
Average Latency(s):     0.00185956
Stddev Latency(s):      0.00161079
Max latency(s):         0.107432
Min latency(s):         0.000603733
Cleaning up (deleting benchmark objects)
Removed 343849 objects
Clean up completed and total clean up time :8.41907
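
For reference, splitting a single NVMe into multiple OSDs can be declared in a
cephadm OSD spec via `osds_per_device`. A minimal sketch -- the service_id,
host pattern, and model string here are assumptions; verify the model against
`ceph orch device ls` before applying:

```shell
# Sketch only: service_id, host_pattern and the model filter are assumed.
cat > osd-nvme.yml <<'EOF'
service_type: osd
service_id: nvme_split
placement:
  host_pattern: 'os-ctrl*'
spec:
  data_devices:
    model: 'P4600'
  osds_per_device: 3
EOF
ceph orch apply -i osd-nvme.yml
```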



On Fri, Aug 25, 2023 at 2:33 PM Anthony D'Atri <anthony.datri@xxxxxxxxx>
wrote:

>
>
> > Thank you for reply,
> >
> > I have created two device classes, ssd and nvme, and assigned them in the CRUSH map.
>
> You don't have enough drives to keep them separate.  Set the NVMe drives
> back to "ssd" and just make one pool.
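
(For reference, reverting the device class is done per OSD. A sketch, assuming
the NVMe-backed OSDs are osd.9, osd.10 and osd.11 -- substitute the real IDs:

```shell
# An existing class must be removed before a new one can be set.
ceph osd crush rm-device-class osd.9 osd.10 osd.11
ceph osd crush set-device-class ssd osd.9 osd.10 osd.11
# Verify the result:
ceph osd tree
```
)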
>
> >
> > $ ceph osd crush rule ls
> > replicated_rule
> > ssd_pool
> > nvme_pool
> >
> >
> > Benchmarks on the NVMe pool perform worst; the SSD pool shows much better
> > results than NVMe.
>
> You have more SATA SSDs and thus more OSDs, than NVMe SSDs.
>
>
> > NvME model is Samsung_SSD_980_PRO_1TB
>
> Client-grade, don't expect much from it.
>
>
> >
> > #### NvME pool benchmark with 3x replication
> >
> > # rados -p test-nvme -t 64 -b 4096 bench 10 write
> > hints = 1
> > Maintaining 64 concurrent writes of 4096 bytes to objects of size 4096
> for
> > up to 10 seconds or 0 objects
> > Object prefix: benchmark_data_os-ctrl1_1931595
> >  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
> > lat(s)
> >    0       0         0         0         0         0           -
> > 0
> >    1      64      5541      5477   21.3917   21.3945   0.0134898
> > 0.0116529
> >    2      64     11209     11145   21.7641   22.1406  0.00939951
> > 0.0114506
> >    3      64     17036     16972   22.0956   22.7617  0.00938263
> > 0.0112938
> >    4      64     23187     23123   22.5776   24.0273  0.00863939
> > 0.0110473
> >    5      64     29753     29689   23.1911   25.6484  0.00925603
> > 0.0107662
> >    6      64     36222     36158   23.5369   25.2695   0.0100759
> > 0.010606
> >    7      63     42997     42934   23.9551   26.4688  0.00902186
> > 0.0104246
> >    8      64     49859     49795   24.3102   26.8008  0.00884379
> > 0.0102765
> >    9      64     56429     56365   24.4601   25.6641  0.00989885
> > 0.0102124
> >   10      31     62727     62696   24.4869   24.7305   0.0115833
> > 0.0102027
> > Total time run:         10.0064
> > Total writes made:      62727
> > Write size:             4096
> > Object size:            4096
> > Bandwidth (MB/sec):     24.4871
> > Stddev Bandwidth:       1.85423
> > Max bandwidth (MB/sec): 26.8008       <------------ Only 26 MB/s for the NVMe disk
> > Min bandwidth (MB/sec): 21.3945
> > Average IOPS:           6268
> > Stddev IOPS:            474.683
> > Max IOPS:               6861
> > Min IOPS:               5477
> > Average Latency(s):     0.0102022
> > Stddev Latency(s):      0.00170505
> > Max latency(s):         0.0365743
> > Min latency(s):         0.00641319
> > Cleaning up (deleting benchmark objects)
> > Removed 62727 objects
> > Clean up completed and total clean up time :8.23223
> >
> >
> >
> > ### SSD pool benchmark
> >
> > (venv-openstack) root@os-ctrl1:~# rados -p test-ssd -t 64 -b 4096 bench
> 10
> > write
> > hints = 1
> > Maintaining 64 concurrent writes of 4096 bytes to objects of size 4096
> for
> > up to 10 seconds or 0 objects
> > Object prefix: benchmark_data_os-ctrl1_1933383
> >  sec Cur ops   started  finished  avg MB/s  cur MB/s last lat(s)  avg
> > lat(s)
> >    0       0         0         0         0         0           -
> > 0
> >    1      63     43839     43776   170.972       171 0.000991462
> > 0.00145833
> >    2      64     92198     92134   179.921   188.898  0.00211419
> > 0.001387
> >    3      64    141917    141853   184.675   194.215  0.00106326
> > 0.00135174
> >    4      63    193151    193088   188.534   200.137  0.00179379
> > 0.00132423
> >    5      63    243104    243041   189.847   195.129 0.000831263
> > 0.00131512
> >    6      63    291045    290982   189.413    187.27  0.00120208
> > 0.00131807
> >    7      64    341295    341231   190.391   196.285  0.00102127
> > 0.00131137
> >    8      63    393336    393273   191.999   203.289 0.000958149
> > 0.00130041
> >    9      63    442459    442396   191.983   191.887  0.00123453
> > 0.00130053
> > Total time run:         10.0008
> > Total writes made:      488729
> > Write size:             4096
> > Object size:            4096
> > Bandwidth (MB/sec):     190.894
> > Stddev Bandwidth:       9.35224
> > Max bandwidth (MB/sec): 203.289
> > Min bandwidth (MB/sec): 171
> > Average IOPS:           48868
> > Stddev IOPS:            2394.17
> > Max IOPS:               52042
> > Min IOPS:               43776
> > Average Latency(s):     0.00130796
> > Stddev Latency(s):      0.000604629
> > Max latency(s):         0.0268462
> > Min latency(s):         0.000628738
> > Cleaning up (deleting benchmark objects)
> > Removed 488729 objects
> > Clean up completed and total clean up time :8.84114
> >
> >
> >
> >
> >
> >
> >
> >
> > On Wed, Aug 23, 2023 at 1:25 PM Adam King <adking@xxxxxxxxxx> wrote:
> >
> >> This should be possible by specifying "data_devices" and "db_devices"
> >> fields in the OSD spec file, each with different filters. There are
> >> examples in the docs
> >> https://docs.ceph.com/en/latest/cephadm/services/osd/#the-simple-case
> >> that show roughly how that's done, and other sections (
> >> https://docs.ceph.com/en/latest/cephadm/services/osd/#filters) that go
> >> more in depth on the filtering options available, so you can try to find
> >> one that works for your disks. You can check the output of "ceph orch
> >> device ls --format json | jq" to see what cephadm considers the model,
> >> size, etc. of each device to be, for use in the filtering.
> >>
> >> On Wed, Aug 23, 2023 at 1:13 PM Satish Patel <satish.txt@xxxxxxxxx>
> wrote:
> >>
> >>> Folks,
> >>>
> >>> I have 3 nodes, each with 1x 1TB NVMe and 3x 2.9TB SSDs. I'm trying to
> >>> build Ceph storage using cephadm on the Ubuntu 22.04 distro.
> >>>
> >>> If I want to use the NVMe for journaling (WAL/DB) for my SSD-based OSDs,
> >>> how does cephadm handle it?
> >>>
> >>> I'm trying to find a document on telling cephadm to deploy the WAL/DB on
> >>> the NVMe to speed up writes. Do I need to create a partition for each
> >>> OSD, or will cephadm create them?
> >>>
> >>> Help me understand how this works, and whether it is worth doing.
> >>> _______________________________________________
> >>> ceph-users mailing list -- ceph-users@xxxxxxx
> >>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >>>
> >>>
>
>



