Hi,

I have a 3-OSD-node Ceph cluster with 1 x 480GB SSD and 8 x 2TB 12Gbps SAS HDDs in each node, providing storage to an OpenStack cluster. Both the public and cluster networks are 2x10G. The WAL and DB of each OSD are on the SSD and share the same 60GB partition.

I run fio with different combinations of operation, block size and IO depth to collect IOPS, bandwidth and latency. I tried fio on a compute node with ioengine=rbd, and also fio inside a VM (backed by Ceph) with ioengine=libaio; rough sketches of both invocations are included after the two examples below. The results don't seem good. Here are a couple of examples.

====================================
fio --name=test --ioengine=rbd --clientname=admin \
    --pool=benchmark --rbdname=test --numjobs=1 \
    --runtime=30 --direct=1 --size=2G \
    --rw=read --bs=4k --iodepth=1

test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=0): [f(1)][100.0%][r=27.6MiB/s,w=0KiB/s][r=7075,w=0 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=56310: Mon Sep 14 19:01:24 2020
  read: IOPS=7610, BW=29.7MiB/s (31.2MB/s)(892MiB/30001msec)
    slat (nsec): min=1550, max=57662, avg=3312.74, stdev=2981.42
    clat (usec): min=77, max=4799, avg=127.39, stdev=39.88
     lat (usec): min=78, max=4812, avg=130.70, stdev=40.67
    clat percentiles (usec):
     |  1.00th=[   82],  5.00th=[   86], 10.00th=[   95], 20.00th=[   98],
     | 30.00th=[  100], 40.00th=[  104], 50.00th=[  116], 60.00th=[  129],
     | 70.00th=[  141], 80.00th=[  157], 90.00th=[  182], 95.00th=[  198],
     | 99.00th=[  233], 99.50th=[  245], 99.90th=[  359], 99.95th=[  515],
     | 99.99th=[  709]
   bw (  KiB/s): min=27160, max=40696, per=100.00%, avg=30474.29, stdev=2826.23, samples=59
   iops        : min= 6790, max=10174, avg=7618.56, stdev=706.56, samples=59
  lat (usec)   : 100=28.89%, 250=70.72%, 500=0.34%, 750=0.05%, 1000=0.01%
  lat (msec)   : 2=0.01%, 10=0.01%
  cpu          : usr=3.55%, sys=3.80%, ctx=228358, majf=0, minf=29
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=228333,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=29.7MiB/s (31.2MB/s), 29.7MiB/s-29.7MiB/s (31.2MB/s-31.2MB/s), io=892MiB (935MB), run=30001-30001msec

Disk stats (read/write):
    dm-0: ios=290/3, merge=0/0, ticks=2427/19, in_queue=2446, util=0.95%, aggrios=290/4, aggrmerge=0/0, aggrticks=2427/39, aggrin_queue=2332, aggrutil=0.95%
  sda: ios=290/4, merge=0/0, ticks=2427/39, in_queue=2332, util=0.95%
====================================

====================================
fio --name=test --ioengine=rbd --clientname=admin \
    --pool=benchmark --rbdname=test --numjobs=1 \
    --runtime=30 --direct=1 --size=2G \
    --rw=write --bs=4k --iodepth=1

test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=6352KiB/s][r=0,w=1588 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=56544: Mon Sep 14 19:03:36 2020
  write: IOPS=1604, BW=6417KiB/s (6571kB/s)(188MiB/30003msec)
    slat (nsec): min=2240, max=45925, avg=6526.95, stdev=3486.19
    clat (usec): min=399, max=35411, avg=615.88, stdev=231.41
     lat (usec): min=402, max=35421, avg=622.40, stdev=232.08
    clat percentiles (usec):
     |  1.00th=[  420],  5.00th=[  449], 10.00th=[  469], 20.00th=[  498],
     | 30.00th=[  529], 40.00th=[  562], 50.00th=[  611], 60.00th=[  652],
     | 70.00th=[  685], 80.00th=[  709], 90.00th=[  766], 95.00th=[  799],
     | 99.00th=[  881], 99.50th=[  955], 99.90th=[ 2671], 99.95th=[ 3097],
     | 99.99th=[ 3785]
   bw (  KiB/s): min= 5944, max= 6792, per=100.00%, avg=6415.95, stdev=178.72, samples=60
   iops        : min= 1486, max= 1698, avg=1603.93, stdev=44.67, samples=60
  lat (usec)   : 500=20.82%, 750=67.23%, 1000=11.55%
  lat (msec)   : 2=0.25%, 4=0.14%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=1.22%, sys=1.25%, ctx=48143, majf=0, minf=18
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,48129,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=6417KiB/s (6571kB/s), 6417KiB/s-6417KiB/s (6571kB/s-6571kB/s), io=188MiB (197MB), run=30003-30003msec

Disk stats (read/write):
    dm-0: ios=31/2, merge=0/0, ticks=342/14, in_queue=356, util=0.12%, aggrios=33/3, aggrmerge=0/0, aggrticks=390/27, aggrin_queue=404, aggrutil=0.13%
  sda: ios=33/3, merge=0/0, ticks=390/27, in_queue=404, util=0.13%
====================================
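For reference, the sweep over operation, block size and IO depth is essentially a nested loop around the same fio invocation as above, something like the following. The parameter values and output file names here are just illustrative; the pool and image are the same ones used in the examples.

# Sweep operation, block size and IO depth against the same RBD image.
# Values below are examples; adjust the lists, runtime and size as needed.
for rw in read write randread randwrite; do
  for bs in 4k 64k 1M; do
    for qd in 1 16 64; do
      fio --name=bench-${rw}-${bs}-qd${qd} \
          --ioengine=rbd --clientname=admin \
          --pool=benchmark --rbdname=test \
          --numjobs=1 --runtime=30 --direct=1 --size=2G \
          --rw=${rw} --bs=${bs} --iodepth=${qd} \
          --output=fio-${rw}-${bs}-qd${qd}.log
    done
  done
done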
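Inside the VM, the run is the libaio equivalent of the same test, roughly like this. /dev/vdb is only a placeholder here; I point fio at the Ceph-backed virtual disk (or a file on it).

# Inside the VM: same 4k QD1 test, via libaio against the Ceph-backed disk.
# /dev/vdb is a placeholder for the RBD-backed volume attached to the VM.
fio --name=test --ioengine=libaio --filename=/dev/vdb \
    --numjobs=1 --runtime=30 --direct=1 --size=2G \
    --rw=read --bs=4k --iodepth=1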
Does that make sense? How do you benchmark your Ceph cluster? I'd appreciate it if you could share your experiences here.

Thanks!
Tony
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx