Re: client cpu usage : krbd vs librbd perf report


 



On Thu, 13 Nov 2014, Alexandre DERUMIER wrote:
> >>I think we need to figure out why so much time is being spent 
> >>mallocing/freeing memory. Got to get those symbols resolved! 
> 
> Ok, I don't know why, but if I remove all the ceph -dbg packages, I'm seeing the rbd && rados symbols now...
> 
> I have updated the files:
> 
> http://odisoweb1.odiso.net/cephperf/perf-librbd/report.txt

Ran it through c++filt:

	https://gist.github.com/88ba9409f5d201b957a1
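
For anyone wanting to reproduce that, it is just the text report piped 
through c++filt, roughly like this (the output filename is arbitrary):

	perf report --stdio 2>&1 | c++filt > report-demangled.txt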

I'm a bit surprised by some of the items near the top 
(bufferlist.clear() callers).  I'm sure several of those can be 
streamlined to avoid temporary bufferlists.  I don't see any super 
egregious users of the allocator, though.

The memcpy callers might be a good place to start...
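
One way to pull those out (assuming the same perf.data as above, and that 
__memcpy_ssse3 is the symbol name as it shows up in this profile) is to 
restrict the report to that symbol and look at its call chains:

	perf report --stdio -S __memcpy_ssse3 2>&1 | c++filt | less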

sage





> 
> 
> 
> 
> ----- Original Message ----- 
> 
> From: "Mark Nelson" <mark.nelson@xxxxxxxxxxx> 
> To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>, "Ceph Devel" <ceph-devel@xxxxxxxxxxxxxxx> 
> Cc: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>, "Sage Weil" <sweil@xxxxxxxxxx>, "Somnath Roy" <somnath.roy@xxxxxxxxxxx> 
> Sent: Thursday, November 13, 2014 15:20:40 
> Subject: Re: client cpu usage : krbd vs librbd perf report 
> 
> On 11/13/2014 05:15 AM, Alexandre DERUMIER wrote: 
> > Hi, 
> > 
> > I have redone perf with dwarf 
> > 
> > perf record -g --call-graph dwarf -a -F 99 -- sleep 60 
> > 
> > I have put perf reports, ceph conf, fio config here: 
> > 
> > http://odisoweb1.odiso.net/cephperf/ 
> > 
> > test setup 
> > ----------- 
> > client cpu config : 8 x Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz 
> > ceph cluster : 3 nodes (same cpu as the client) with 2 osd each (intel ssd s3500), test pool with replication x1 
> > rbd volume size : 10G (almost all reads are done in osd buffer cache) 
> > 
> > benchmark with fio 4k randread, with 1 rbd volume. (also tested with 20 rbd volumes, results are the same). 
> > debian wheezy - kernel 3.17 - and ceph packages from master on gitbuilder 
> > 
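> > For reference, the job is along these lines; the exact parameters are in the 
> > fio config linked above, and the pool/volume/client names and queue depth 
> > below are only placeholders: 
> > 
> > fio --name=randread-4k --ioengine=rbd --clientname=admin --pool=testpool \
> >     --rbdname=vm-disk-1 --rw=randread --bs=4k --iodepth=32 --runtime=60 --time_based 
> > 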
> > (BTW, I have installed the librbd/rados dbg packages but I still have missing symbols?) 
> 
> I think if you run perf report with verbose enabled it will tell you 
> which symbols are missing: 
> 
> perf report -v 2>&1 | less 
> 
> If you have them but perf isn't detecting them properly, you can clean out 
> the cache or even manually reassign the symbols, but it's annoying. 
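> 
> Roughly, in case it helps (the paths are only examples; perf keeps its 
> symbol cache under ~/.debug by default, and the librbd path depends on 
> the distro): 
> 
> perf buildid-list | grep -i rbd                  # build-ids recorded for the rbd libs 
> rm -rf ~/.debug                                  # wipe the cached symbols 
> perf buildid-cache -a /usr/lib/librbd.so.1.0.0   # re-add the library by hand 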
> 
> > 
> > 
> > 
> > Global results: 
> > --------------- 
> > librbd : 60000 iops : 98% cpu 
> > krbd : 90000 iops : 32% cpu 
> > 
> > 
> > So, librbd cpu usage is 4.5x that of krbd for the same IO throughput 
> > 
> > The difference seems quite huge; is this expected? 
> 
> This is kind of the wild west. With that many IOPS we are running into 
> new bottlenecks. :) 
> 
> > 
> > 
> > 
> > 
> > librbd perf report: 
> > ------------------------- 
> > top cpu usage 
> > -------------- 
> > 25.71% fio libc-2.13.so 
> > 17.69% fio librados.so.2.0.0 
> > 12.38% fio librbd.so.1.0.0 
> > 27.99% fio [kernel.kallsyms] 
> > 4.19% fio libpthread-2.13.so 
> > 
> > 
> > libc-2.13.so (seems that malloc/free use a lot of cpu here) 
> > ------------ 
> > 21.05%-- _int_malloc 
> > 14.36%-- free 
> > 13.66%-- malloc 
> > 9.89%-- __lll_unlock_wake_private 
> > 5.35%-- __clone 
> > 4.38%-- __poll 
> > 3.77%-- __memcpy_ssse3 
> > 1.64%-- vfprintf 
> > 1.02%-- arena_get2 
> > 
> 
> I think we need to figure out why so much time is being spent 
> mallocing/freeing memory. Got to get those symbols resolved! 
> 
> > fio [kernel.kallsyms] : seems to have a lot of futex functions here 
> > ----------------------- 
> > 5.27%-- _raw_spin_lock 
> > 3.88%-- futex_wake 
> > 2.88%-- __switch_to 
> > 2.74%-- system_call 
> > 2.70%-- __schedule 
> > 2.52%-- tcp_sendmsg 
> > 2.47%-- futex_wait_setup 
> > 2.28%-- _raw_spin_lock_irqsave 
> > 2.16%-- idle_cpu 
> > 1.66%-- enqueue_task_fair 
> > 1.57%-- native_write_msr_safe 
> > 1.49%-- hash_futex 
> > 1.46%-- futex_wait 
> > 1.40%-- reschedule_interrupt 
> > 1.37%-- try_to_wake_up 
> > 1.28%-- account_entity_enqueue 
> > 1.25%-- copy_user_enhanced_fast_string 
> > 1.25%-- futex_requeue 
> > 1.24%-- __fget 
> > 1.24%-- update_curr 
> > 1.20%-- tcp_write_xmit 
> > 1.14%-- wake_futex 
> > 1.08%-- scheduler_ipi 
> > 1.05%-- select_task_rq_fair 
> > 1.01%-- dequeue_task_fair 
> > 0.97%-- do_futex 
> > 0.97%-- futex_wait_queue_me 
> > 0.83%-- cpuacct_charge 
> > 0.82%-- tcp_transmit_skb 
> > ... 
> > 
> > 
> > Regards, 
> > 
> > Alexandre 
> > 
> > 
> > 
> > 



