Please do!
Mark
On 11/19/2014 01:29 AM, Alexandre DERUMIER wrote:
Hi,
Can I open a tracker issue for this?
----- Original Message -----
From: "Haomai Wang" <haomaiwang@xxxxxxxxx>
To: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
Cc: "Sage Weil" <sage@xxxxxxxxxxxx>, "Alexandre DERUMIER" <aderumier@xxxxxxxxx>, "Somnath Roy" <somnath.roy@xxxxxxxxxxx>, "Ceph Devel" <ceph-devel@xxxxxxxxxxxxxxx>
Sent: Thursday, November 13, 2014 19:15:24
Subject: Re: client cpu usage : kbrd vs librbd perf report
Hmm, I think buffer alloc/dealloc is a good perf topic to discuss.
For example, frequently allocated objects could use a memory pool
(each pool storing objects of the same type), but the biggest
challenge there is the STL structures.
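To make the memory pool idea concrete, here is a minimal free-list sketch. It is written in Python only for brevity; Ceph itself is C++, where this would more likely be a per-type allocator. `ObjectPool`, its parameters, and the 4096-byte buffer are hypothetical names and values chosen for illustration, not anything from the Ceph tree.

```python
class ObjectPool:
    """Free-list pool for one object type (hypothetical helper, not a Ceph API)."""

    def __init__(self, factory, reset, max_idle=1024):
        self._factory = factory    # builds a new object when the pool is empty
        self._reset = reset        # clears an object before it is reused
        self._free = []            # idle objects ready for reuse
        self._max_idle = max_idle  # cap on idle objects kept around
        self.created = 0           # how many objects were really allocated

    def acquire(self):
        # Reuse an idle object if one exists; allocate only on a miss.
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        # Reset the object and keep it for reuse, up to the idle cap.
        self._reset(obj)
        if len(self._free) < self._max_idle:
            self._free.append(obj)


# One pool per object type; here the "type" is a 4 KiB buffer (assumed size).
pool = ObjectPool(factory=lambda: bytearray(4096), reset=lambda buf: None)

for _ in range(1000):
    buf = pool.acquire()
    buf[0] = 1          # stand-in for real work on the buffer
    pool.release(buf)

print("objects actually allocated:", pool.created)
```

With one outstanding buffer at a time, the loop allocates once and reuses that object afterwards. The STL concern raised above still applies: containers that allocate internally would bypass a pool like this unless they are given a custom allocator.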
On Fri, Nov 14, 2014 at 1:05 AM, Mark Nelson <mark.nelson@xxxxxxxxxxx> wrote:
On 11/13/2014 10:29 AM, Sage Weil wrote:
On Thu, 13 Nov 2014, Alexandre DERUMIER wrote:
I think we need to figure out why so much time is being spent
mallocing/freeing memory. Got to get those symbols resolved!
OK, I don't know why, but if I remove all the ceph -dbg packages, I now see
the rbd and rados symbols...
I have updated the files:
http://odisoweb1.odiso.net/cephperf/perf-librbd/report.txt
Ran it through c++filt:
https://gist.github.com/88ba9409f5d201b957a1
I'm a bit surprised by some of the items near the top
(bufferlist.clear() callers). I'm sure several of those can be
streamlined to avoid temporary bufferlists. I don't see any super
egregious users of the allocator, though.
The memcpy callers might be a good place to start...
sage
Wasn't Josh looking into some of this a year ago? Did anything ever come of
that work?
----- Original Message -----
From: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>
To: "Alexandre DERUMIER" <aderumier@xxxxxxxxx>, "Ceph Devel"
<ceph-devel@xxxxxxxxxxxxxxx>
Cc: "Mark Nelson" <mark.nelson@xxxxxxxxxxx>, "Sage Weil"
<sweil@xxxxxxxxxx>, "Somnath Roy" <somnath.roy@xxxxxxxxxxx>
Sent: Thursday, November 13, 2014 15:20:40
Subject: Re: client cpu usage : kbrd vs librbd perf report
On 11/13/2014 05:15 AM, Alexandre DERUMIER wrote:
Hi,
I have redone the perf run with DWARF call graphs:
perf record -g --call-graph dwarf -a -F 99 -- sleep 60
I have put perf reports, ceph conf, fio config here:
http://odisoweb1.odiso.net/cephperf/
test setup
-----------
client cpu config : 8 x Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz
ceph cluster : 3 nodes (same CPU as the client) with 2 OSDs each (Intel SSD
S3500), test pool with replication x1
rbd volume size : 10G (almost all reads are served from the OSD buffer cache)
benchmark with fio 4k randread, with 1 rbd volume (also tested with 20
rbd volumes; the results are the same).
debian wheezy - kernel 3.17 - and ceph packages from master on
gitbuilder
(BTW, I have installed the librbd/rados dbg packages, but symbols are still
missing?)
I think if you run perf report with verbose enabled it will tell you
which symbols are missing:
perf report -v 2>&1 | less
If you have them but it's not detecting them properly you can clean out
the cache or even manually reassign the symbols but it's annoying.
Global results:
---------------
librbd : 60000iops : 98% cpu
krbd : 90000iops : 32% cpu
So librbd uses about 4.5x more CPU than krbd per I/O.
The difference seems quite large; is it expected?
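The ratio can be checked by normalizing CPU usage to throughput. This is a minimal sketch using only the two result lines above; it assumes both CPU percentages were measured the same way on the same client. `cpu_per_kiops` is a helper name made up for this example.

```python
def cpu_per_kiops(cpu_percent, iops):
    """CPU percent consumed per 1000 IOPS."""
    return cpu_percent / (iops / 1000.0)


librbd = cpu_per_kiops(98.0, 60000)   # librbd : 60000 iops : 98% cpu
krbd = cpu_per_kiops(32.0, 90000)     # krbd   : 90000 iops : 32% cpu
ratio = librbd / krbd

print(f"librbd: {librbd:.2f}% CPU per kIOPS")
print(f"krbd:   {krbd:.2f}% CPU per kIOPS")
print(f"ratio:  {ratio:.2f}x")
```

This gives roughly 4.6x, which matches the figure quoted above.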
This is kind of the wild west. With that many IOPS we are running into
new bottlenecks. :)
librbd perf report:
-------------------------
top cpu usage
--------------
25.71% fio libc-2.13.so
17.69% fio librados.so.2.0.0
12.38% fio librbd.so.1.0.0
27.99% fio [kernel.kallsyms]
4.19% fio libpthread-2.13.so
libc-2.13.so (it seems malloc/free use a lot of CPU here)
------------
21.05%-- _int_malloc
14.36%-- free
13.66%-- malloc
9.89%-- __lll_unlock_wake_private
5.35%-- __clone
4.38%-- __poll
3.77%-- __memcpy_ssse3
1.64%-- vfprintf
1.02%-- arena_get2
I think we need to figure out why so much time is being spent
mallocing/freeing memory. Got to get those symbols resolved!
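To put a number on the allocator cost, the libc breakdown can be combined with libc's share of all samples. This is a rough sketch under one assumption: that the per-symbol percentages are relative to libc's own samples rather than to the whole profile. Which symbols count as "allocator" is also a judgment call. `__lll_unlock_wake_private` is left out because it is a generic lock path, though some of it may come from malloc arena locking.

```python
LIBC_SHARE_OF_TOTAL = 25.71  # % of all samples in libc-2.13.so

# Per-symbol percentages copied from the libc section of the report.
libc_breakdown = {
    "_int_malloc": 21.05,
    "free": 14.36,
    "malloc": 13.66,
    "__lll_unlock_wake_private": 9.89,
    "__clone": 5.35,
    "__poll": 4.38,
    "__memcpy_ssse3": 3.77,
    "vfprintf": 1.64,
    "arena_get2": 1.02,
}

# Assumption: these four are the allocator proper.
ALLOCATOR_SYMBOLS = {"_int_malloc", "free", "malloc", "arena_get2"}

alloc_within_libc = sum(
    pct for sym, pct in libc_breakdown.items() if sym in ALLOCATOR_SYMBOLS
)
alloc_of_total = LIBC_SHARE_OF_TOTAL * alloc_within_libc / 100.0

print(f"allocator share of libc samples: {alloc_within_libc:.2f}%")
print(f"allocator share of all samples:  {alloc_of_total:.2f}%")
```

Under that assumption, the allocator accounts for about half of the libc time, or roughly 13% of all samples.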
fio [kernel.kallsyms] : seems to have a lot of futex functions here
-----------------------
5.27%-- _raw_spin_lock
3.88%-- futex_wake
2.88%-- __switch_to
2.74%-- system_call
2.70%-- __schedule
2.52%-- tcp_sendmsg
2.47%-- futex_wait_setup
2.28%-- _raw_spin_lock_irqsave
2.16%-- idle_cpu
1.66%-- enqueue_task_fair
1.57%-- native_write_msr_safe
1.49%-- hash_futex
1.46%-- futex_wait
1.40%-- reschedule_interrupt
1.37%-- try_to_wake_up
1.28%-- account_entity_enqueue
1.25%-- copy_user_enhanced_fast_string
1.25%-- futex_requeue
1.24%-- __fget
1.24%-- update_curr
1.20%-- tcp_write_xmit
1.14%-- wake_futex
1.08%-- scheduler_ipi
1.05%-- select_task_rq_fair
1.01%-- dequeue_task_fair
0.97%-- do_futex
0.97%-- futex_wait_queue_me
0.83%-- cpuacct_charge
0.82%-- tcp_transmit_skb
...
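The futex share of the kernel samples can be estimated by parsing the listing above and summing every symbol with "futex" in its name. This sketch makes two assumptions: the percentages are relative to the [kernel.kallsyms] samples, and the truncated tail ("...") is ignored, so the result is a lower bound. The regular expression only targets the `NN.NN%-- symbol` shape shown here, not perf output in general.

```python
import re

KERNEL_SHARE_OF_TOTAL = 27.99  # % of all samples in [kernel.kallsyms]

REPORT = """\
5.27%-- _raw_spin_lock
3.88%-- futex_wake
2.88%-- __switch_to
2.74%-- system_call
2.70%-- __schedule
2.52%-- tcp_sendmsg
2.47%-- futex_wait_setup
2.28%-- _raw_spin_lock_irqsave
2.16%-- idle_cpu
1.66%-- enqueue_task_fair
1.57%-- native_write_msr_safe
1.49%-- hash_futex
1.46%-- futex_wait
1.40%-- reschedule_interrupt
1.37%-- try_to_wake_up
1.28%-- account_entity_enqueue
1.25%-- copy_user_enhanced_fast_string
1.25%-- futex_requeue
1.24%-- __fget
1.24%-- update_curr
1.20%-- tcp_write_xmit
1.14%-- wake_futex
1.08%-- scheduler_ipi
1.05%-- select_task_rq_fair
1.01%-- dequeue_task_fair
0.97%-- do_futex
0.97%-- futex_wait_queue_me
0.83%-- cpuacct_charge
0.82%-- tcp_transmit_skb
...
"""

LINE_RE = re.compile(r"^\s*([\d.]+)%--\s+(\S+)\s*$")


def parse_report(text):
    """Return (symbol, percent) pairs; lines that do not match are skipped."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((match.group(2), float(match.group(1))))
    return entries


entries = parse_report(REPORT)
futex_symbols = [(sym, pct) for sym, pct in entries if "futex" in sym]
futex_within_kernel = sum(pct for _, pct in futex_symbols)
futex_of_total = KERNEL_SHARE_OF_TOTAL * futex_within_kernel / 100.0

print(f"futex symbols listed:           {len(futex_symbols)}")
print(f"futex share of kernel samples:  {futex_within_kernel:.2f}%")
print(f"futex share of all samples:     {futex_of_total:.2f}%")
```

The scheduler entries (`__schedule`, `try_to_wake_up`, `enqueue_task_fair`, and so on) are not counted here, although part of that time is likely a consequence of the same wake/wait traffic.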
Regards,
Alexandre
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html