Re: ceph-osd performance on ram disk


 



On 10/09/2020 19:37, Mark Nelson wrote:
On 9/10/20 11:03 AM, George Shuklin wrote:

...
Are there any knobs to tweak to see higher performance for ceph-osd? I'm pretty sure it's not any kind of leveling, GC or other 'iops-related' issue (brd performance is two orders of magnitude higher).


So as you've seen, Ceph does a lot more than just write a chunk of data out to a block on disk.  There's a ton of encoding/decoding, CRC checksums, CRUSH calculations, onode lookups, write-ahead logging, and other work involved that all adds latency.  You can overcome some of that through parallelism, but 30K IOPS per OSD is probably about right for a Nautilus-era OSD.  For Octopus+ the cache refactor in BlueStore should get you farther (40-50K+ for an OSD in isolation).  The maximum performance we've seen in-house is around 70-80K IOPS on a single OSD using very fast NVMe and highly tuned settings.
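
(For anyone wanting to reproduce those numbers: one common way to measure small random-write IOPS is fio's rbd engine, with rados bench as a rougher, dependency-free alternative.  This is just a sketch; the pool and image names are placeholders and the queue depths are only illustrative:

    # 4K random writes through librbd against a test image
    fio --name=4k-randwrite --ioengine=rbd --clientname=admin \
        --pool=testpool --rbdname=testimg \
        --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
        --time_based --runtime=60 --group_reporting

    # or 4K object writes straight to a pool
    rados bench -p testpool 60 write -b 4096 -t 32 --no-cleanup

To isolate a single OSD, point these at a pool whose CRUSH rule maps to only that OSD.)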


A couple of things you can try (example commands follow the list):


- upgrade to octopus+ for the cache refactor

- Make sure you are using the equivalent of the latency-performance or latency-network tuned profile.  The most important part is disabling CPU C-state transitions.

- increase osd_memory_target if you have a larger dataset (onode cache misses in bluestore add a lot of latency)

- enable turbo if it's disabled (higher clock speed generally helps)
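
(A minimal sketch of the tuning steps above, assuming a host with tuned and cpupower installed; the memory target value is only an example:

    # low-latency profile; among other things it keeps the CPUs out of deep C-states
    tuned-adm profile latency-performance

    # verify idle states and clock behaviour
    cpupower idle-info
    cpupower frequency-info

    # raise the per-OSD memory target, e.g. to 8 GiB
    ceph config set osd osd_memory_target 8589934592

Turbo is normally a BIOS setting; on hosts using the intel_pstate driver you can check it with "cat /sys/devices/system/cpu/intel_pstate/no_turbo" (0 means turbo is enabled).)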


On the write path you are correct that there is a limitation regarding a single kv sync thread.  Over the years we've made this less of a bottleneck, but it's possible you could still be hitting it.  In our test lab we've managed to utilize up to around 12-14 cores on a single OSD in isolation with 16 tp_osd_tp worker threads, and on a larger cluster about 6-7 cores per OSD.  There are probably multiple factors at play, including context switching, cache thrashing, memory throughput, object creation/destruction, etc.  If you decide to look into it further, you may want to try wallclock profiling the OSD under load to see where it spends its time.
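
(A rough approximation with stock tools, if you want to try that: perf gives the on-CPU picture, and a gdb dump of all thread stacks gives a crude wallclock view; note that attaching gdb briefly pauses the OSD, so don't do it on a production cluster.  Assuming a single ceph-osd process on the host:

    # sample on-CPU stacks of the OSD for 30 seconds
    perf record -F 99 -g -p $(pidof ceph-osd) -- sleep 30
    perf report --sort symbol

    # crude all-thread stack snapshot (repeat a few times)
    gdb -p $(pidof ceph-osd) -batch -ex "thread apply all bt"

You can also check how many tp_osd_tp workers you actually have with "ceph config show osd.0 | grep osd_op_num" (the total is osd_op_num_shards x osd_op_num_threads_per_shard).)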

Thank you for the feedback.

I forgot to mention: it's Octopus, a fresh installation.

I've disabled C-states (governor=performance), and it makes no difference: same IOPS, same CPU use by ceph-osd.  I just can't force Ceph to consume more than 330% CPU.  I can push reads up to 150k IOPS (both over the network and locally), hitting the CPU limit, but writes are somehow restricted by Ceph itself.
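
(For completeness: the cpufreq governor and C-states are separate knobs, so governor=performance by itself does not necessarily keep the cores out of deep idle states.  A quick check, assuming the sysfs cpuidle interface is available:

    cpupower idle-info
    grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/disable

The tuned latency-performance profile mentioned earlier handles this by holding /dev/cpu_dma_latency at a low value.)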

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



