Re: librados (librbd) slower than krbd

> Another big CPU drain is debug ms = 1. We recently decided to disable it by default in master since the overhead is so high. You can see that PR here:
>
> https://github.com/ceph/ceph/pull/26936

Okay, thanks! After disabling it, the latency difference between krbd and librbd dropped slightly; it's now about 0.57 ms (krbd) vs 0.63 ms (librbd) in my setup. Overall it's getting decent, since I'm approaching 0.5 ms latency... :)
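
For anyone else trying this, the knob can be set in ceph.conf or at runtime (the config-database form needs a recent release):

    # ceph.conf, [global] section
    debug ms = 0/0

    # or at runtime, cluster-wide
    ceph config set global debug_ms 0/0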

I also tried a patch for librados that stops it from recalculating the PG's OSDs for every operation. It also helps, but only slightly, shaving about 0.015 ms off the latency :) (and it's probably only usable in small clusters with a small number of PGs). A rough sketch of the idea is below.
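
This is not the actual patch, just the shape of it; the names here are hypothetical, not the real Objecter internals. The idea is to memoize the pg -> acting-OSD mapping and throw the cache away whenever the osdmap epoch changes:

    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    // Hypothetical cache: pg id -> acting OSD set, valid for one osdmap epoch.
    struct PgMappingCache {
      uint32_t cached_epoch = 0;
      std::unordered_map<uint64_t, std::vector<int>> acting;

      // compute() stands in for the actual CRUSH calculation.
      template <typename ComputeFn>
      const std::vector<int>& get(uint32_t epoch, uint64_t pg, ComputeFn compute) {
        if (epoch != cached_epoch) {   // osdmap changed: all mappings are stale
          acting.clear();
          cached_epoch = epoch;
        }
        auto it = acting.find(pg);
        if (it == acting.end())        // first op touching this pg this epoch
          it = acting.emplace(pg, compute(pg)).first;
        return it->second;
      }
    };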

I still can't really understand what's making librados so slow... Is it just the C++ code? :)

I can only see two things in the valgrind profiles: the "self" instruction count for buffer::list::append and friends is 24%, and tcmalloc's is 15%. The CRUSH calculation, which I had removed by caching it in my test, was taking 5.7% in that same profile, so... if 5.7% stands for 0.015 ms, could 24% stand for roughly 0.06 ms? :)
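
(That's just linear scaling, assuming instruction share maps proportionally to latency: 0.015 ms * 24 / 5.7 ≈ 0.063 ms.)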

It seems buffer::list::append is called many times, basically once per field of the output structure. Would it be better to allocate space for several fields at once and fill it with simple assignments, or was I digging in the wrong direction, with most of the overhead coming from copying the original buffer (which is invisible in the profile)? A small illustration of the two patterns is below.
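
To illustrate what I mean (generic C++, not Ceph's actual bufferlist API): per-field appends each pay call and bookkeeping overhead, while one reservation followed by plain stores does not:

    #include <cstdint>
    #include <cstring>
    #include <string>

    struct Header { uint32_t a; uint32_t b; uint64_t c; };

    // Pattern 1: one append per field (what the profile suggests happens now).
    void encode_per_field(const Header& h, std::string& out) {
      out.append(reinterpret_cast<const char*>(&h.a), sizeof(h.a));
      out.append(reinterpret_cast<const char*>(&h.b), sizeof(h.b));
      out.append(reinterpret_cast<const char*>(&h.c), sizeof(h.c));
    }

    // Pattern 2: reserve the whole struct's worth once, then direct writes.
    void encode_batched(const Header& h, std::string& out) {
      size_t off = out.size();
      out.resize(off + sizeof(h.a) + sizeof(h.b) + sizeof(h.c));
      char* p = &out[off];
      std::memcpy(p, &h.a, sizeof(h.a)); p += sizeof(h.a);
      std::memcpy(p, &h.b, sizeof(h.b)); p += sizeof(h.b);
      std::memcpy(p, &h.c, sizeof(h.c));
    }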

> and the associated performance data:
>
> https://docs.google.com/spreadsheets/d/1Zi3MFtvwLzCFfObL6evQKYtINQVQIjZ0SXczG78AnJM/edit?usp=sharing
>
> Mark


--
With best regards,
  Vitaliy Filippov


