Good day,

Random write operations (randwrite, 4kB and 4MB) over mapped RBD are just too slow. I am also using librbd over TGT.

fio input:

[global]
rw=randwrite
ioengine=libaio
iodepth=64
size=1g
direct=1
buffered=0
startdelay=5
group_reporting=1
thread=1
ramp_time=5
time_based
disk_util=0
clat_percentiles=0
disable_lat=1
disable_clat=1
disable_slat=1
#numjobs=16
runtime=60
filename=/mnt/disk/test1/testfile.fio

[test]
name=test
bs=4k
stonewall

fio output for TGT (librbd):

test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
Starting 1 thread
test: Laying out IO files (2 files / total 1024MiB)
test: (groupid=0, jobs=1): err= 0: pid=6909: Fri Jun 19 15:26:11 2020
  write: IOPS=6342, BW=24.8MiB/s (25.0MB/s)(1487MiB/60003msec)
   bw (  KiB/s): min=8, max=70216, per=100.00%, avg=30441.30, stdev=28899.02, samples=100
   iops        : min=2, max=17554, avg=7610.27, stdev=7224.76, samples=100
  cpu          : usr=2.18%, sys=11.08%, ctx=107852, majf=0, minf=356
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=115.5%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,380583,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=24.8MiB/s (25.0MB/s), 24.8MiB/s-24.8MiB/s (25.0MB/s-25.0MB/s), io=1487MiB (1559MB), run=60003-60003msec

-----------------

fio output for RBD:

test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
Starting 1 thread
test: (groupid=0, jobs=1): err= 0: pid=7372: Fri Jun 19 15:33:51 2020
  write: IOPS=909, BW=3642KiB/s (3729kB/s)(214MiB/60186msec)
   bw (  KiB/s): min=2792, max=4688, per=100.00%, avg=3648.13, stdev=399.09, samples=120
   iops        : min=698, max=1172, avg=912.01, stdev=99.75, samples=120
  cpu          : usr=0.78%, sys=3.08%, ctx=37108, majf=0, minf=267
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=110.2%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=0,54732,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: bw=3642KiB/s (3729kB/s), 3642KiB/s-3642KiB/s (3729kB/s-3729kB/s), io=214MiB (224MB), run=60186-60186msec

-----------------

I ran these tests from a separate client server. I suspect that RBD is not working correctly there, since I tried some fio tests on it and the result was almost the same with RBD cache set to false in ceph.conf, for example:

fio -ioengine=libaio -name=test -bs=4M -iodepth=64 -numjobs=16 -rw=randwrite -direct=1 -runtime=60 -filename=/mnt/disk/test1 -size=10g

Can you give me any ideas where the problem might be, perhaps with the RBD cache? Network capacity and the usual things have been tested already. I can provide more specs if needed.

Thanks!
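P.S. In case it helps narrow this down, my next step would be to take the kernel block device and the filesystem out of the picture by pointing fio's rbd ioengine directly at librbd. A rough sketch of that run, assuming fio was built with rbd support; the pool "rbd", image "testimage" and client "admin" below are placeholders for whatever exists on the cluster, not my actual names:

fio -ioengine=rbd -clientname=admin -pool=rbd -rbdname=testimage \
    -name=test -bs=4k -iodepth=64 -rw=randwrite -time_based -runtime=60

If that job also lands around 900 IOPS, the bottleneck would be in the cluster/network path itself; if it runs fast, the problem would sit between the mapped krbd device and the filesystem on the client.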
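Also, to make sure the rbd cache setting in ceph.conf is actually reaching the librbd client inside TGT (as far as I understand, the kernel RBD client does not use librbd's cache at all), I could query the running client over its admin socket. A sketch, with a placeholder socket path and pid:

# First enable an admin socket for clients in ceph.conf (the path template
# is just an example; any directory writable by the client works):
#   [client]
#   admin socket = /var/run/ceph/$cluster-$type.$id.$pid.asok
# Then, with TGT running, ask the live librbd client what it actually uses:
ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok config show | grep rbd_cache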