Hi,

while deploying a small test cluster on two existing servers I got
several benchmark results I don't understand.

In summary, I think the rados bench values are okay for my limited
setup: ~80 MiB/s for writes and ~300 MiB/s for reads. For the rbd
benchmarks, however, the values drop to ~46 MiB/s for writes and
~71 MiB/s for reads. Have I missed something with rbd bench? I don't
understand why rbd is so much slower than rados.
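One difference I noticed while writing this mail: rados bench works
with 4 MiB objects, while rbd bench apparently defaults to io_size
4096 (see the output below). I haven't rerun the rbd tests with a
matching I/O size yet; if I read the man page correctly, it should be
something like:

# rbd bench --io-type write --io-size 4M --io-threads 16 image01 --pool=testbench
# rbd bench --io-type read --io-size 4M --io-threads 16 image01 --pool=testbench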
Note: the production cluster will have 3 servers in total, each with
6x 12 TB SAS3 disks, plus maybe 2 additional servers for up to 5
monitors. I know Ceph is more fun with more than 3 nodes, but the
budget is limited :(

Here are the results of all benchmarks:
# rados bench -p testbench 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_virt-master4_1711455
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16        31        15   59.9956        60     0.430283    0.622336
    2      16        51        35   69.9923        80     0.565152    0.635417
    3      16        74        58   77.3241        92     0.746055    0.663046
    4      16        97        81     80.99        92     0.558241    0.646831
    5      16       122       106   84.7893       100     0.705829    0.686034
    6      16       149       133   88.6553       108     0.496865    0.674404
    7      16       170       154   87.9886        84     0.916755    0.674053
    8      16       189       173   86.4887        76      4.24615    0.700781
    9      16       215       199   88.4327       104      0.60869    0.702643
   10      16       235       219    87.588        80     0.882353    0.696558
   11       1       235       234   85.0792        60     0.666018    0.692935
Total time run: 11.8131
Total writes made: 235
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 79.5726
Stddev Bandwidth: 16.1087
Max bandwidth (MB/sec): 108
Min bandwidth (MB/sec): 60
Average IOPS: 19
Stddev IOPS: 4.02718
Max IOPS: 27
Min IOPS: 15
Average Latency(s): 0.708564
Stddev Latency(s): 0.440743
Max latency(s): 4.36566
Min latency(s): 0.318937
# rados bench -p testbench 10 seq
hints = 1
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16        87        71   283.925       284     0.354899    0.187907
    2      16       161       145   289.941       296     0.122009    0.194255
    3       5       235       230   306.611       340     0.231543    0.198982
Total time run: 3.08835
Total reads made: 235
Read size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 304.37
Average IOPS: 76
Stddev IOPS: 7.37111
Max IOPS: 85
Min IOPS: 71
Average Latency(s): 0.200212
Max latency(s): 1.87814
Min latency(s): 0.017529
# rbd bench --io-type write image01 --pool=testbench
bench type write io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
SEC OPS OPS/SEC BYTES/SEC
1 11520 11536.2 45 MiB/s
2 23984 12000.2 47 MiB/s
3 38304 12739.6 50 MiB/s
4 52672 13028.9 51 MiB/s
5 63616 12575.7 49 MiB/s
6 71456 11987.4 47 MiB/s
7 82464 11677.5 46 MiB/s
8 96080 11573.9 45 MiB/s
9 110080 11436.1 45 MiB/s
10 124512 12327.4 48 MiB/s
11 136496 13008.2 51 MiB/s
12 148160 13118.5 51 MiB/s
13 163216 13374 52 MiB/s
14 176848 13527 53 MiB/s
15 191120 13132.7 51 MiB/s
16 203376 13376.3 52 MiB/s
17 212576 12924.8 50 MiB/s
18 221120 11534.9 45 MiB/s
19 228032 10196.2 40 MiB/s
20 241584 10240.5 40 MiB/s
21 254176 10095.6 39 MiB/s
elapsed: 22 ops: 262144 ops/sec: 11896.4 bytes/sec: 46 MiB/s
~# rbd bench --io-type read image01 --pool=testbench
bench type read io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
SEC OPS OPS/SEC BYTES/SEC
1 18368 18384.4 72 MiB/s
2 36032 18024.3 70 MiB/s
3 56768 18928.4 74 MiB/s
4 75472 18872.4 74 MiB/s
5 93264 18656.4 73 MiB/s
6 109232 18173.1 71 MiB/s
7 128320 18458 72 MiB/s
8 146336 17913.9 70 MiB/s
9 164176 17741.1 69 MiB/s
10 184720 18291.6 71 MiB/s
11 202416 18637.2 73 MiB/s
12 216688 17673.9 69 MiB/s
13 236464 18025.9 70 MiB/s
14 255776 18320.4 72 MiB/s
elapsed: 14 ops: 262144 ops/sec: 18255.5 bytes/sec: 71 MiB/s
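If I convert the rbd numbers myself (my assumption: bytes/sec =
ops/sec x io_size), they are at least self-consistent:

# python3 -c 'print(11896.4 * 4096 / 2**20)'   # rbd write: ~46.5 MiB/s
# python3 -c 'print(18255.5 * 4096 / 2**20)'   # rbd read:  ~71.3 MiB/s

rados bench, on the other hand, moved 4 MiB objects (~19 write IOPS x
4 MiB is roughly 76-80 MiB/s), so in pure ops/sec rbd is actually far
ahead here; only the I/O size differs.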
Some cluster details (per server):
* SAS3 SEAGATE ST12000NM004J
* Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
* Dual NIC Intel E810 (LACP, 2x 10 Gbit links)
* LSI SAS3004 PCI-Express Fusion-MPT SAS-3
* 192 GB ECC RAM
Ceph Squid deployed via cephadm.
~# ceph -s
  cluster:
    id:     7644057a-00f6-11f0-9a0c-eac00fed9338
    health: HEALTH_WARN
            1 pool(s) do not have an application enabled

  services:
    mon: 2 daemons, quorum virt-master4,virt-master3 (age 24h)
    mgr: virt-master4.lddrxr(active, since 24h), standbys: virt-master3.akkopo
    osd: 2 osds: 2 up (since 3h), 2 in (since 3h)

  data:
    pools:   3 pools, 65 pgs
    objects: 3.43k objects, 13 GiB
    usage:   27 GiB used, 22 TiB / 22 TiB avail
    pgs:     65 active+clean
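I assume the HEALTH_WARN only refers to the testbench pool, which has
no application tag set (see the pool details below); if so, this
should clear it:

# ceph osd pool application enable testbench rbd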
~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 21.82808 root default
-5 10.91399 host virt-master3
1 hdd 10.91399 osd.1 up 1.00000 1.00000
-3 10.91409 host virt-master4
0 hdd 10.91409 osd.0 up 1.00000 1.00000
# ceph osd pool ls detail
pool 1 '.mgr' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 17 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr read_balance_score 2.00
pool 2 'kvm' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 33 lfor 0/0/28 flags hashpspool,selfmanaged_snaps stripe_width 0 compression_algorithm snappy compression_mode aggressive application rbd read_balance_score 1.06
pool 3 'testbench' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 446 lfor 0/443/441 flags hashpspool,selfmanaged_snaps stripe_width 0 read_balance_score 1.06
Direct benchmarks of the osd.0 disk with 4M and 4K block sizes:
root@virt-master4:~# fio --ioengine=sync --filename=/dev/disk/by-id/scsi-35000c500f8bfda83 --direct=1 --sync=1 --rw=write --bs=4M --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=fio
fio: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=sync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=172MiB/s][w=43 IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=1585455: Sat Mar 15 14:29:21 2025
  write: IOPS=42, BW=170MiB/s (178MB/s)(9.97GiB/60013msec); 0 zone resets
    clat (usec): min=21993, max=40219, avg=23274.37, stdev=963.46
     lat (usec): min=22341, max=40453, avg=23511.77, stdev=961.26
    clat percentiles (usec):
     |  1.00th=[22152],  5.00th=[22152], 10.00th=[22414], 20.00th=[22676],
     | 30.00th=[22938], 40.00th=[22938], 50.00th=[23200], 60.00th=[23200],
     | 70.00th=[23462], 80.00th=[23462], 90.00th=[24249], 95.00th=[24511],
     | 99.00th=[24773], 99.50th=[31327], 99.90th=[32637], 99.95th=[32637],
     | 99.99th=[40109]
   bw (  KiB/s): min=163840, max=180224, per=99.99%, avg=174166.05, stdev=3764.97, samples=119
   iops        : min=   40, max=   44, avg=42.52, stdev= 0.92, samples=119
  lat (msec)   : 50=100.00%
  cpu          : usr=1.00%, sys=2.05%, ctx=7648, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,2552,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=170MiB/s (178MB/s), 170MiB/s-170MiB/s (178MB/s-178MB/s), io=9.97GiB (10.7GB), run=60013-60013msec
root@virt-master4:~# fio --ioengine=sync --filename=/dev/disk/by-id/scsi-35000c500f8bfda83 --direct=1 --sync=1 --rw=randwrite --bs=4M --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=fio
fio: (g=0): rw=randwrite, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=sync, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][w=124MiB/s][w=31 IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=1588408: Sat Mar 15 14:37:13 2025
  write: IOPS=30, BW=123MiB/s (129MB/s)(7360MiB/60028msec); 0 zone resets
    clat (usec): min=19173, max=77481, avg=32327.62, stdev=6604.03
     lat (usec): min=19453, max=77684, avg=32615.39, stdev=6605.82
    clat percentiles (usec):
     |  1.00th=[21103],  5.00th=[23462], 10.00th=[25035], 20.00th=[26870],
     | 30.00th=[28443], 40.00th=[30016], 50.00th=[31327], 60.00th=[32637],
     | 70.00th=[34341], 80.00th=[36439], 90.00th=[41157], 95.00th=[45876],
     | 99.00th=[51119], 99.50th=[53216], 99.90th=[69731], 99.95th=[77071],
     | 99.99th=[77071]
   bw (  KiB/s): min=65536, max=147456, per=99.99%, avg=125542.40, stdev=9567.74, samples=120
   iops        : min=   16, max=   36, avg=30.65, stdev= 2.34, samples=120
  lat (msec)   : 20=0.22%, 50=98.32%, 100=1.47%
  cpu          : usr=0.90%, sys=1.53%, ctx=5513, majf=0, minf=13
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1840,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=123MiB/s (129MB/s), 123MiB/s-123MiB/s (129MB/s-129MB/s), io=7360MiB (7718MB), run=60028-60028msec
root@virt-master4:~# fio --ioengine=io_uring --filename=/dev/disk/by-id/scsi-35000c500f8bfda83 --direct=1 --sync=1 --rw=write --bs=4K --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=fio
fio: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=1
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][w=37.6MiB/s][w=9628 IOPS][eta 00m:00s]
fio: (groupid=0, jobs=1): err= 0: pid=1591836: Sat Mar 15 14:48:18 2025
  write: IOPS=9560, BW=37.3MiB/s (39.2MB/s)(2241MiB/60001msec); 0 zone resets
    slat (usec): min=3, max=242, avg= 5.75, stdev= 3.14
    clat (usec): min=3, max=37068, avg=97.67, stdev=81.74
     lat (usec): min=84, max=37086, avg=103.42, stdev=81.88
    clat percentiles (usec):
     |  1.00th=[   87],  5.00th=[   89], 10.00th=[   90], 20.00th=[   92],
     | 30.00th=[   93], 40.00th=[   94], 50.00th=[   95], 60.00th=[   97],
     | 70.00th=[   98], 80.00th=[  100], 90.00th=[  104], 95.00th=[  109],
     | 99.00th=[  125], 99.50th=[  159], 99.90th=[  416], 99.95th=[  676],
     | 99.99th=[  889]
   bw (  KiB/s): min=13600, max=39344, per=100.00%, avg=38312.13, stdev=2316.38, samples=119
   iops        : min= 3400, max= 9836, avg=9578.03, stdev=579.09, samples=119
  lat (usec)   : 4=0.01%, 20=0.01%, 50=0.01%, 100=80.54%, 250=19.17%
  lat (usec)   : 500=0.21%, 750=0.04%, 1000=0.03%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=3.54%, sys=8.67%, ctx=573655, majf=0, minf=52
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,573639,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=37.3MiB/s (39.2MB/s), 37.3MiB/s-37.3MiB/s (39.2MB/s-39.2MB/s), io=2241MiB (2350MB), run=60001-60001msec
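As a next step, a more apples-to-apples comparison might be fio
against the image itself via librbd. Untested here, and it assumes
fio was built with rbd support and the client.admin keyring is
readable, but it should look roughly like:

# fio --ioengine=rbd --clientname=admin --pool=testbench --rbdname=image01 --rw=write --bs=4M --iodepth=16 --numjobs=1 --runtime=60 --time_based --name=rbdfio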
--
Best regards / Mit freundlichen Grüßen
Daniel Vogelbacher
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx