On 6/9/21 10:48 AM, Wido den Hollander wrote:
On 09/06/2021 14:33, Ilya Dryomov wrote:
On Wed, Jun 9, 2021 at 1:38 PM Wido den Hollander <wido@xxxxxxxx> wrote:
Hi,
While doing some benchmarks I have two identical Ceph clusters:
3x SuperMicro 1U
AMD Epyc 7302P 16C
256GB DDR
4x Samsung PM983 1.92TB
100Gbit networking
I tested on such a setup with v16.2.4 with fio:
bs=4k
qd=1
IOps: 695
That was very low, as I was expecting well over 1000 IOps.
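For reference, the fio job looks roughly like this (a sketch, not the
exact job file; the job name and runtime are placeholders, and the
target/ioengine lines are left out here):

[qd1-4k-randwrite]
rw=randwrite
bs=4k
iodepth=1
numjobs=1
time_based
runtime=60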
I checked with the second Ceph cluster which was still running v15.2.8,
the result: 1364 IOps.
I then upgraded from 15.2.8 to 15.2.13: 725 IOps
Looking at the differences between v15.2.8 and v15.2.13 in options.cc I
saw these options:
bluefs_buffered_io: false -> true
bluestore_cache_trim_max_skip_pinned: 1000 -> 64
The main difference seems to be 'bluefs_buffered_io', but in both cases
this was already explicitly set to 'true'.
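Just to rule out a config mix-up, the value the OSDs actually run with
can be checked with something like this (osd.0 is just an example
daemon; the second command has to run on that OSD's host):

$ ceph config get osd bluefs_buffered_io
$ ceph daemon osd.0 config get bluefs_buffered_io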
So anything beyond 15.2.8 is right now giving me a much lower I/O
performance with Queue Depth = 1 and Block Size = 4k.
15.2.8: 1364 IOps
15.2.13: 725 IOps
16.2.4: 695 IOps
Has anybody else seen this as well? I'm trying to figure out where this
is going wrong.
Hi Wido,
Going by the subject, I assume these are rbd numbers? If so, did you
run any RADOS-level benchmarks?
Yes, rbd benchmark using fio.
$ rados -p rbd -t 1 -O 4096 -b 4096 bench 60 write
Average IOPS: 1024
Stddev IOPS: 29.6598
Max IOPS: 1072
Min IOPS: 918
Average Latency(s): 0.000974444
Stddev Latency(s): 0.000306557
So that seems more or less OK: still roughly 1k IOps, which at queue
depth 1 works out to a write latency of ~1ms (1/1024 ≈ 0.98ms). But it
was ~0.75ms when writing through RBD.
I now have a 16.2.4 and 15.2.13 cluster with identical hardware to run
some benchmarks on.
Wido
Good job narrowing it down so far. Are you testing with fio on a real
file system backed by RBD, or with librbd directly? It would be worth
trying librbd directly if possible, and also disabling rbd_cache. Let's
try to get this as close to the OSD as possible.
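Something along these lines would take the filesystem and rbd_cache out
of the picture (a sketch; the pool, image and client names are
placeholders):

$ fio --name=qd1-4k --ioengine=rbd --clientname=admin --pool=rbd \
      --rbdname=bench --rw=randwrite --bs=4k --iodepth=1 \
      --time_based --runtime=60

with the cache disabled on the client side in ceph.conf:

[client]
rbd cache = false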
The OSD has been a little temperamental lately when profiling it, but
gdbpmp (or Adam's wallclock profiler) might be helpful in figuring out
what the OSD is spending its time on in both cases.
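If you go the gdbpmp route, it is roughly this (flags from memory, so
double-check against the script's --help; <osd-pid> is the pid of one
ceph-osd process):

$ ./gdbpmp.py -p <osd-pid> -n 1000 -o osd.gdbpmp   # sample the running OSD while fio runs
$ ./gdbpmp.py -i osd.gdbpmp                        # dump the collected call-graph profile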
Mark
Thanks,
Ilya
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx