Hi Mark,

Thanks for posting these blogs; they are very interesting to read. Maybe you have an answer to a question I asked on the dev list.

We ran an fio benchmark against a 3-node Ceph cluster with 96 OSDs, using 4 KB objects, and used the gdbpmp profiler (https://github.com/markhpc/gdbpmp) to analyze thread performance. We discovered that the bstore_kv_sync thread is always busy, while all 16 tp_osd_tp threads are not busy most of the time (they wait on a condition variable or a lock).

Given that 3 RocksDB column families are sharded, and sharding is configurable, why not run multiple (3) bstore_kv_sync threads? They would not conflict most of the time. This could remove the RocksDB bottleneck and increase IOPS. Can you explain this design choice?

________________________________
From: Mark Nelson <mnelson@xxxxxxxxxx>
Sent: Tuesday, November 8, 2022 10:20 PM
To: ceph-users@xxxxxxx <ceph-users@xxxxxxx>
Subject: Recent ceph.io Performance Blog Posts

CAUTION: External Sender

Hi Folks,

I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people:

1. https://ceph.io/en/news/blog/2022/rocksdb-tuning-deep-dive/
2. https://ceph.io/en/news/blog/2022/qemu-kvm-tuning/
3. https://ceph.io/en/news/blog/2022/ceph-osd-cpu-scaling/

The first covers RocksDB tuning: how we arrived at our defaults, an analysis of some common settings that have been floating around on the mailing list, and potential new settings that we are considering making default in the future.

The second covers how to tune QEMU/KVM with librbd to achieve high single-client performance on a small (30 OSD) NVMe-backed cluster. This article also covers the performance impact of enabling 128-bit AES over-the-wire encryption.
The third covers per-OSD CPU/core scaling and the kind of IOPS/core and IOPS/NVMe numbers that are achievable both on a single OSD and on a larger (60 OSD) NVMe cluster. In this case there are enough clients and a high enough per-client iodepth to saturate the OSD(s).

I hope these are helpful or at least interesting!

Thanks,
Mark
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx