Ceph 16.x RBD significant performance degradation over time

Hello,

I've recently been doing some extensive tests of Ceph as block storage (RBD).

What I did is basically:
- create a cluster with a specific version
- apply the same config
- fully prewrite 128 RBD images
- run the test suite 3 times (25 fio runs with different block sizes and iodepths)

It turns out that on Ceph 16.x, RBD performance degrades significantly after a single pass of the test suite. There is no such behavior on older versions.
Each fio run lasts 15 minutes, so an entire suite is 25 fio runs * 15 minutes. A sketch of a single fio job is below.
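
For reference, a single run in the suite is a fio job of roughly this shape (a sketch only: the exact job files aren't included here, so the image name, the rw pattern and the 4k/32 block size/iodepth pair are illustrative; the real suite sweeps 25 such combinations):

  [global]
  ioengine=rbd
  clientname=admin
  pool=rbd
  rbdname=vol-001
  direct=1
  rw=randwrite
  time_based=1
  # 15 minutes per run
  runtime=900

  [rbd-test]
  bs=4k
  iodepth=32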

Has anyone observed such a thing?

Comparison results are posted on GitHub Pages (see the relative comparison tables):
https://brabiega.github.io/ceph/bench/14-2-16-eucephcom.html
https://brabiega.github.io/ceph/bench/14-2-22-eucephcom.html
https://brabiega.github.io/ceph/bench/15-2-16-eucephcom.html
https://brabiega.github.io/ceph/bench/16-2-5-eucephcom.html
https://brabiega.github.io/ceph/bench/16-2-7-eucephcom.html
https://brabiega.github.io/ceph/bench/16-2-7-znver2o2-minconf.html (minimal ceph.conf, only the basics needed for a working cluster)

All the results are here, including another interesting finding (see the 15.2.14 comparison):
https://brabiega.github.io/ceph/ceph.html


Here's my setup:

Hardware setup
--------------
3x backend servers
CPU: 2x AMD EPYC 7402 24-Core (48c+48t)
Storage: 24x NVMe
Network: 40gbps
OS: Ubuntu Focal
Kernel: 5.15.0-18-generic

4x client servers
CPU: 2x AMD EPYC 7402 24-Core (48c+48t)
Network: 40gbps
OS: Ubuntu Focal
Kernel: 5.11.0-37-generic

Software config
---------------
72 OSDs in total (24 OSDs per host)
1 OSD per NVMe drive
Each OSD runs in an LXD container
Scrub disabled
Deep-scrub disabled
Ceph balancer off
1 pool 'rbd':
- 1024 PG
- PG autoscaler off (rough command equivalents sketched below)
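
Roughly, those settings correspond to something like the following (a sketch, not the exact commands from the deployment):

  # disable scrubbing and deep-scrubbing cluster-wide for the test window
  ceph osd set noscrub
  ceph osd set nodeep-scrub

  # turn the balancer off
  ceph balancer off

  # single 'rbd' pool, fixed at 1024 PGs, autoscaler disabled
  ceph osd pool create rbd 1024 1024
  ceph osd pool set rbd pg_autoscale_mode off
  ceph osd pool application enable rbd rbd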

Test environment
----------------
- 128 RBD images (default features, size 128GB)
- All the images are fully written before any tests are done! (4194909 objects allocated)
- Client version: Ceph 16.2.7, vanilla packages from eu.ceph.com
- Each client runs fio with the rbd engine (librbd) against 32 RBD images (4 x 32 = 128 in total); see the prewrite sketch below
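
The per-image prewrite step looks roughly like this (a sketch; image names such as vol-001 are made up, and the queue depth is illustrative). At the default 4 MiB RBD object size a fully written 128GB image maps to 32768 data objects, and 128 x 32768 = 4194304, which together with per-image metadata objects roughly matches the ~4.19M objects above:

  # create one of the 128 images (default features)
  rbd create --size 128G rbd/vol-001

  # fill it completely with sequential writes so later reads hit allocated objects
  fio --name=prewrite --ioengine=rbd --clientname=admin \
      --pool=rbd --rbdname=vol-001 \
      --rw=write --bs=4M --iodepth=16 --direct=1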

BR