Hi Darren & Anthony,

>>How many PGs have you got configured for the Ceph pool that you are testing against?

I have created the CloudStack pools with pg_num = 64.

ceph osd pool get cloudstack-BRK pg_autoscale_mode
pg_autoscale_mode: on

ceph osd pool ls
.mgr
cloudstack-GUL
cloudstack-BRK
.nfs
cephfs.cephfs.meta
cephfs.cephfs.data

>>Have you tried the same benchmark without the replication setup?

No, I have connected the 3-replica pool to CloudStack.

RBD bench results:

With replica 1:

rados bench -p testbench 10 write --run-name client1
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_STR-01-BRK_342365
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16       860       844   3375.75      3376   0.00919862   0.0187299
    2      16      1775      1759   3517.55      3660    0.0128823   0.0180904
    3      16      2773      2757   3675.46      3992    0.0193005   0.0173626
    4      16      3759      3743   3742.45      3944   0.00819503   0.0170552
    5      16      4743      4727   3781.04      3936    0.0202672   0.0169013
    6      16      5766      5750   3832.77      4092    0.0124358   0.0166735
    7      16      6783      6767   3866.28      4068    0.0235515   0.0165316
    8      16      7761      7745   3871.91      3912    0.0162872   0.0165084
    9      16      8791      8775   3899.39      4120    0.0159618   0.0163925
   10      16      9810      9794   3916.99      4076    0.0267244   0.0163123
Total time run:         10.0108
Total writes made:      9810
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     3919.78
Stddev Bandwidth:       232.231
Max bandwidth (MB/sec): 4120
Min bandwidth (MB/sec): 3376
Average IOPS:           979
Stddev IOPS:            58.0578
Max IOPS:               1030
Min IOPS:               844
Average Latency(s):     0.0163137
Stddev Latency(s):      0.00699291
Max latency(s):         0.0683829
Min latency(s):         0.00599489
Cleaning up (deleting benchmark objects)
Removed 9810 objects
Clean up completed and total clean up time: 0.47255

rbd bench image01 --pool=testbench --io-size=256M --io-threads=8 --io-type=write
bench  type write io_size 268435456 io_threads 8 bytes 1073741824 pattern sequential
  SEC       OPS   OPS/SEC   BYTES/SEC
elapsed: 5   ops: 4   ops/sec: 0.71125   bytes/sec: 182 MiB/s

With replica 3:

rados bench -p rbd_rnd 10 write --run-name client1
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_STR-01-BRK_342454
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16       421       405   1619.88      1620     0.035311   0.0386264
    2      16       832       816   1631.76      1644    0.0235547   0.0386749
    3      16      1246      1230   1639.73      1656    0.0367235   0.0388173
    4      16      1660      1644   1643.72      1656    0.0444481   0.0387138
    5      16      2090      2074   1658.92      1720    0.0296465   0.0384273
    6      16      2515      2499   1665.72      1700    0.0474488   0.0383093
    7      16      2928      2912   1663.72      1652    0.0404621   0.0383227
    8      16      3356      3340   1669.72      1712    0.0346635   0.0382409
    9      16      3781      3765   1673.04      1700    0.0356325   0.0381586
   10      14      4196      4182   1672.51      1668    0.0455228   0.0381928
Total time run:         10.0196
Total writes made:      4196
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     1675.11
Stddev Bandwidth:       33.1354
Max bandwidth (MB/sec): 1720
Min bandwidth (MB/sec): 1620
Average IOPS:           418
Stddev IOPS:            8.28385
Max IOPS:               430
Min IOPS:               405
Average Latency(s):     0.0381648
Stddev Latency(s):      0.011959
Max latency(s):         0.100015
Min latency(s):         0.0172346
Cleaning up (deleting benchmark objects)
Removed 4196 objects
Clean up completed and total clean up time: 0.357656

rbd bench image01 --pool=rbd_rnd --io-size=256M --io-threads=8 --io-type=write
bench  type write io_size 268435456 io_threads 8 bytes 1073741824 pattern sequential
  SEC       OPS   OPS/SEC   BYTES/SEC
elapsed: 5   ops: 4   ops/sec: 0.692054   bytes/sec: 177 MiB/s
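Note: with --io-size=256M each rbd bench run above completes only 4 ops before the 1 GiB total (bytes 1073741824) is reached, so the MiB/s figures are very coarse. A re-run with a 4 MiB I/O size, matching the rados bench object size, would probably give a more comparable number; a sketch against the same image01 images, with the --io-threads and --io-total values chosen only as examples:

rbd bench image01 --pool=testbench --io-type=write --io-size=4M --io-threads=16 --io-total=4G
rbd bench image01 --pool=rbd_rnd --io-type=write --io-size=4M --io-threads=16 --io-total=4G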
>>Which replication are you using, snapshot or journal based?

Snapshot-based replication.

>>Can you run a benchmark with 100% read and 100% write to see what the differences are between them?

fio --ioengine=libaio --direct=1 --randrepeat=1 --refill_buffers --end_fsync=1 --filename=/root/ceph-rbd --name=write --size=1024m --bs=4k --rw=write --iodepth=32 --numjobs=16
  write: IOPS=7840, BW=30.6MiB/s (32.1MB/s)(1024MiB/33434msec); 0 zone resets

fio --ioengine=libaio --direct=1 --randrepeat=1 --refill_buffers --end_fsync=1 --filename=/root/ceph-rbd --name=read --size=1024m --bs=4k --rw=read --iodepth=32 --numjobs=16
  read: IOPS=13.8k, BW=53.9MiB/s (56.5MB/s)(1024MiB/18990msec)
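If useful, the same fio jobs can be repeated with a random pattern to get 100% random read/write figures as well; a sketch that only swaps the --name and --rw values of the commands above:

fio --ioengine=libaio --direct=1 --randrepeat=1 --refill_buffers --end_fsync=1 --filename=/root/ceph-rbd --name=randwrite --size=1024m --bs=4k --rw=randwrite --iodepth=32 --numjobs=16

fio --ioengine=libaio --direct=1 --randrepeat=1 --refill_buffers --end_fsync=1 --filename=/root/ceph-rbd --name=randread --size=1024m --bs=4k --rw=randread --iodepth=32 --numjobs=16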
>>How have you configured data protection? Erasure coded or Replicas?

Replicas.

>>Is the machine you are running the benchmark on a CloudStack virtual machine or a bare metal physical machine?

The benchmarks were run inside a virtual machine deployed via CloudStack.

>>What is the network connectivity of the machine you are benchmarking on?

25 Gb x 4 NICs with an LACP bond for the public and cluster networks.

>>Send `ceph osd dump | grep pool` and `ceph -s` for both clusters.

Cluster 1:

ceph osd dump | grep pool
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 35 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 23.08
pool 2 'cloudstack-GUL' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 79031 lfor 0/0/85 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 2.63
pool 3 'cloudstack-BRK' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 78978 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 2.25
pool 4 '.nfs' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 105 lfor 0/0/103 flags hashpspool stripe_width 0 application nfs read_balance_score 3.00
pool 5 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 144 lfor 0/0/114 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 4.48
pool 6 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode on last_change 144 lfor 0/0/142 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.27
pool 7 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 3262 lfor 0/0/3260 flags hashpspool stripe_width 0 application rgw read_balance_score 3.00
pool 8 'testbench' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 17306 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 1.88

ceph -s
  cluster:
    id:     1614bf56-70fe-11ef-890c-8d560ac5fab0
    health: HEALTH_OK

  services:
    mon:        3 daemons, quorum STR-01-GUL,STR-02-GUL,STR-03-GUL (age 2M)
    mgr:        STR-01-GUL.lovayi(active, since 2M), standbys: STR-02-GUL.jxryby
    mds:        1/1 daemons up, 1 standby
    osd:        24 osds: 24 up (since 5M), 24 in (since 5M)
    rbd-mirror: 1 daemon active (1 hosts)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 1297 pgs
    objects: 1.07M objects, 3.9 TiB
    usage:   12 TiB used, 156 TiB / 168 TiB avail
    pgs:     1297 active+clean

  io:
    client:   2.9 MiB/s rd, 7.8 MiB/s wr, 376 op/s rd, 608 op/s wr

Cluster 2:

ceph osd dump | grep pool
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 35 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 23.08
pool 2 'cloudstack-GUL' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 85862 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 2.63
pool 3 'cloudstack-BRK' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 85791 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 1.88
pool 4 '.nfs' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 97 lfor 0/0/95 flags hashpspool stripe_width 0 application nfs read_balance_score 3.00
pool 5 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 143 lfor 0/0/113 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs read_balance_score 2.99
pool 6 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode on last_change 143 lfor 0/0/141 flags hashpspool,bulk stripe_width 0 application cephfs read_balance_score 1.27
pool 9 'rbd_rnd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 85857 lfor 0/0/85639 flags hashpspool,selfmanaged_snaps max_bytes 536870912000 stripe_width 0 application rbd read_balance_score 3.00
pool 11 'testbench' replicated size 1 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 85838 lfor 0/0/85824 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd read_balance_score 3.71

ceph -s
  cluster:
    id:     d12f0e2a-7100-11ef-ab63-11f6a8d75f5b
    health: HEALTH_WARN
            1 pool(s) have no replicas configured

  services:
    mon:        3 daemons, quorum STR-01-BRK,STR-03-BRK,STR-02-BRK (age 2M)
    mgr:        STR-01-BRK.eppard(active, since 5M), standbys: STR-03-BRK.varwpu
    mds:        1/1 daemons up, 1 standby
    osd:        24 osds: 24 up (since 5M), 24 in (since 5M)
    rbd-mirror: 1 daemon active (1 hosts)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 1265 pgs
    objects: 329.79k objects, 882 GiB
    usage:   2.6 TiB used, 165 TiB / 168 TiB avail
    pgs:     1265 active+clean

  io:
    client:   3.7 MiB/s rd, 1.8 MiB/s wr, 234 op/s rd, 181 op/s wr
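For what it's worth, the HEALTH_WARN on cluster 2 ("1 pool(s) have no replicas configured") comes from the testbench pool, which is set to size 1 for the single-replica benchmark. Once that testing is finished the pool can be given replicas again, or the warning muted; for example:

ceph osd pool set testbench size 3
ceph osd pool set testbench min_size 2
ceph health mute POOL_NO_REDUNDANCY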