I know. I tested fio itself before testing Ceph with fio. With the null ioengine, fio can handle up to 14M IOPS (on my dusty lab R220). On null_blk it drops to 2.4-2.8M IOPS. On brd it drops to a sad 700k IOPS.

BTW, never run synthetic high-performance benchmarks on KVM. My old server with the 'makelinuxfastagain' fixes completes one I/O request in 3.4us; on a KVM VM it becomes 24us. Someone reported about 8.5us on VMware. That's all on a purely software stack without any hypervisor I/O involved. 24us sounds like a small number, but if your synthetic test does 200k IOPS, that's only 5us per request. You can't get 200k IOPS on a VM with a 24us syscall time.

On Thu, Sep 10, 2020, 22:49 Виталий Филиппов <vitalif@xxxxxxxxxx> wrote:

> By the way, DON'T USE rados bench. It's an incorrect benchmark. ONLY use
> fio.
>
> On 10 September 2020, 22:35:53 GMT+03:00, vitalif@xxxxxxxxxx wrote:
>>
>> Hi George,
>>
>> Author of Ceph_performance here! :)
>>
>> I suspect you're running tests with 1 PG. Every PG's requests are always
>> serialized, which is why the OSD doesn't utilize all threads with 1 PG.
>> You need something like 8 PGs per OSD. More than 8 usually doesn't
>> improve results.
>>
>> Also note that read tests are meaningless after a full overwrite on small
>> OSDs because everything fits in cache. Restart the OSD to clear it. You
>> can drop the cache via the admin socket too, but restarting is the
>> simplest way.
>>
>> I've repeated your test with brd. My results with 8 PGs, after filling
>> the RBD image, turning CPU powersave off, and restarting the OSD, are:
>>
>> # fio -name=test -ioengine=rbd -bs=4k -iodepth=1 -rw=randread -pool=ramdisk -rbdname=testimg
>> read: IOPS=3586, BW=14.0MiB/s (14.7MB/s)(411MiB/29315msec)
>> lat (usec): min=182, max=5710, avg=277.41, stdev=90.16
>>
>> # fio -name=test -ioengine=rbd -bs=4k -iodepth=1 -rw=randwrite -pool=ramdisk -rbdname=testimg
>> write: IOPS=1247, BW=4991KiB/s (5111kB/s)(67.0MiB/13746msec); 0 zone resets
>> lat (usec): min=555, max=4015, avg=799.45, stdev=142.92
>>
>> # fio -name=test -ioengine=rbd -bs=4k -iodepth=128 -rw=randwrite -pool=ramdisk -rbdname=testimg
>> write: IOPS=4138, BW=16.2MiB/s (16.9MB/s)(282MiB/17451msec); 0 zone resets
>> 658% CPU
>>
>> # fio -name=test -ioengine=rbd -bs=4k -iodepth=128 -rw=randread -pool=ramdisk -rbdname=testimg
>> read: IOPS=15.7k, BW=61.4MiB/s (64.4MB/s)(979MiB/15933msec)
>> 540% CPU
>>
>> Basically the same shit as on an NVMe. So even an "in-memory Ceph" is slow, haha.
>>
>>> Thank you!
>>>
>>> I know that article, but it promises 6 cores of usage per OSD, and I got
>>> barely over three, and all of this in a totally synthetic environment with
>>> no SSD to blame (brd is more than fast and has very consistent latency
>>> under any kind of load).
>>>
>> ------------------------------
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>
> --
> With best regards,
> Vitaliy Filippov

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
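
For anyone who wants to reproduce the fio baseline mentioned at the top of the thread (null ioengine, null_blk, brd) before involving Ceph at all, here is a minimal sketch. The exact job parameters weren't given above, and the module parameters and device names (/dev/nullb0, /dev/ram0, 4 GiB ramdisk) are the usual defaults, so adjust for your own setup:

# fio alone, no block device at all (the null ioengine just discards I/O):
fio -name=test -ioengine=null -size=10G -bs=4k -iodepth=128 -rw=randwrite

# null_blk, a no-op kernel block device:
modprobe null_blk
fio -name=test -ioengine=libaio -direct=1 -filename=/dev/nullb0 -bs=4k -iodepth=128 -rw=randwrite

# brd, a RAM-backed block device (rd_size is in KiB, so 4194304 = 4 GiB):
modprobe brd rd_nr=1 rd_size=4194304
fio -name=test -ioengine=libaio -direct=1 -filename=/dev/ram0 -bs=4k -iodepth=128 -rw=randwrite

Running the same commands with -iodepth=1 gives the per-request latency figures discussed above rather than the IOPS ceiling.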
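
Likewise, a minimal sketch of the test-pool setup implied by the quoted advice (8 PGs, pool "ramdisk" and image "testimg" as in the fio commands above); the OSD id and image size are examples, and the restart command assumes a systemd deployment:

ceph osd pool create ramdisk 8 8
ceph osd pool application enable ramdisk rbd
rbd create ramdisk/testimg --size 10240

# fill the image once, then restart the OSD before read tests so nothing
# is served from its cache (osd.0 is just an example id):
systemctl restart ceph-osd@0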