Good afternoon, everybody!

I have the following setup:
RBD pool with 3x replication
5 hosts with 12 spinning SAS disks each

I'm testing with exactly the following fio command:
fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4M -size=10G \
    -iodepth=16 -rw=write -filename=./test.img

With large block sizes I can easily reach 1.5 GB/s or more.

But with a 4K block size (bs=4K) I get a measly 12 MB/s, which is quite
annoying. I get the same rate with rw=read.
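
To put that number in perspective, here is a rough back-of-the-envelope
conversion from throughput to IOPS and per-request latency. It is only a
sketch: it assumes the 12 MB/s figure means 12 MiB/s, and it applies
Little's law (mean latency = outstanding requests / completion rate) using
the iodepth=16 from the fio line above. The variable names are made up for
illustration.

```shell
#!/bin/sh
# Assumption: "12 MB/s" is read as 12 MiB/s.
throughput_kib=$((12 * 1024))   # measured throughput in KiB/s
block_kib=4                     # block size in KiB (bs=4K)
iodepth=16                      # queue depth from the fio command

# Requests completed per second at this block size.
iops=$((throughput_kib / block_kib))

# Little's law: mean latency = requests in flight / completion rate.
# Integer division, so the result is truncated to whole microseconds.
latency_us=$((iodepth * 1000000 / iops))

echo "IOPS: $iops"
echo "mean latency per request (us): $latency_us"
```

Under these assumptions that is roughly 3,000 IOPS at about 5 ms per
request, so the 4K result looks limited by per-request latency rather than
by bandwidth.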

With librbd's cache enabled, writes improve considerably, but reads stay
the same.

I already tested with rbd_read_from_replica_policy=balance but didn't
notice any difference. I also tried to keep readahead enabled by setting
rbd_readahead_disable_after_bytes=0, but that made no difference for
sequential reads either.
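
For reference, the two client-side settings mentioned above would look
roughly like this as a ceph.conf fragment. This is a sketch only: placing
them in the [client] section is an assumption, and the same options can be
applied in other ways.

```ini
[client]
# Assumed placement: these are librbd (client-side) options.
rbd_read_from_replica_policy = balance
# 0 means readahead is never disabled after a byte threshold.
rbd_readahead_disable_after_bytes = 0
```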

Note: I also tested on another, smaller cluster with 36 SAS disks and got
the same result.

I don't know exactly what to look for or configure to improve this.
