Re: Significant slowdown of osds since v0.67 Dumpling

Can you attach a log from the startup of one of the dumpling osds on
your production machine (no need to turn up logging, I just need some of
the information dumped on every boot)?

libleveldb is leveldb.  We've used leveldb for a few things since
bobtail.  If anything, the load on leveldb should be lighter in
dumpling, I would think...  I'll have to try to reproduce it locally.
I'll keep you posted.
-Sam
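
For comparing further profiles, the libleveldb share of a `perf top`
listing can be totalled per binary.  What follows is only a minimal
sketch: it is not part of Ceph or perf, the function names are made up
for illustration, and the sample lines are copied from the listings
quoted below (with their e-mail quote markers left in place).

```python
import re
from collections import defaultdict

# Matches lines such as "  3.38%  libleveldb.so.1.9   [.] 0x22821",
# optionally preceded by e-mail quote markers ("> ").
LINE_RE = re.compile(r"^[>\s]*(\d+(?:\.\d+)?)%\s+(\S+)")


def share_by_object(listing):
    """Sum the sampled percentage per shared object / binary."""
    totals = defaultdict(float)
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match:  # wrapped symbol names and headings are skipped
            totals[match.group(2)] += float(match.group(1))
    return dict(totals)


def leveldb_share(listing):
    """Total percentage attributed to any libleveldb object."""
    return sum(pct for obj, pct in share_by_object(listing).items()
               if obj.startswith("libleveldb"))


# A few lines copied from the two listings in this thread.
DUMPLING_SAMPLE = """\
>   3.38%  libleveldb.so.1.9            [.] 0x22821
>   2.04%  ceph-osd                     [.] ceph_crc32c_le_intel
>   0.96%  libleveldb.so.1.9            [.]
> leveldb::InternalKeyComparator::Compar
>   0.78%  libleveldb.so.1.9            [.] leveldb::Block::Iter::Next()
>   0.56%  ceph-osd                     [.] PGLog::undirty()
"""

CUTTLEFISH_SAMPLE = """\
>   2.70%  ceph-osd                     [.] ceph_crc32c_le
>   0.70%  libc-2.11.3.so               [.] memcpy
"""

if __name__ == "__main__":
    print(f"Dumpling   libleveldb: {leveldb_share(DUMPLING_SAMPLE):.2f}%")
    print(f"Cuttlefish libleveldb: {leveldb_share(CUTTLEFISH_SAMPLE):.2f}%")
```

The parser only looks at the percentage and the object name, so the
symbol names that got wrapped onto their own line do not matter.  Note
that `perf top` percentages are relative to all samples on the machine,
so the totals are only comparable between runs under similar load.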

On Sat, Aug 24, 2013 at 10:11 AM, Oliver Daudey <oliver@xxxxxxxxx> wrote:
> Hey Samuel,
>
> Unfortunately, disabling "wbthrottle" made almost no difference on my
> production-cluster.  OSD-load was still much higher on Dumpling.
>
> I've mentioned this several times already, but when profiling with `perf
> top' on my production-cluster, any time I'm running a Dumpling-OSD,
> several "libleveldb"-related entries come up near the top, that don't
> show up when running the Cuttlefish-OSD at all.  Let's concentrate on
> that for a moment, as it's a clearly visible difference on my
> production-cluster, which shows the actual problem.
>
> Dumpling OSDs:
>  17.23%  [kernel]                     [k] intel_idle
>   6.35%  [kernel]                     [k] find_busiest_group
>   4.36%  kvm                          [.] 0x2cdbb0
>   3.38%  libleveldb.so.1.9            [.] 0x22821
>   2.40%  libc-2.11.3.so               [.] memcmp
>   2.04%  ceph-osd                     [.] ceph_crc32c_le_intel
>   1.90%  [kernel]                     [k] _raw_spin_lock
>   1.87%  [kernel]                     [k] copy_user_generic_string
>   1.35%  [kernel]                     [k] default_send_IPI_mask_sequence_phys
>   1.34%  [kernel]                     [k] __hrtimer_start_range_ns
>   1.14%  libc-2.11.3.so               [.] memcpy
>   1.03%  [kernel]                     [k] hrtimer_interrupt
>   1.01%  [kernel]                     [k] do_select
>   1.00%  [kernel]                     [k] __schedule
>   0.99%  [kernel]                     [k] _raw_spin_unlock_irqrestore
>   0.97%  [kernel]                     [k] cpumask_next_and
>   0.97%  [kernel]                     [k] find_next_bit
>   0.96%  libleveldb.so.1.9            [.] leveldb::InternalKeyComparator::Compar
>   0.91%  [kernel]                     [k] _raw_spin_lock_irqsave
>   0.91%  [kernel]                     [k] fget_light
>   0.89%  [kernel]                     [k] clockevents_program_event
>   0.79%  [kernel]                     [k] sync_inodes_sb
>   0.78%  libleveldb.so.1.9            [.] leveldb::Block::Iter::Next()
>   0.75%  [kernel]                     [k] apic_timer_interrupt
>   0.70%  [kernel]                     [k] native_write_cr0
>   0.60%  [kvm_intel]                  [k] vmx_vcpu_run
>   0.58%  [kernel]                     [k] load_balance
>   0.57%  [kernel]                     [k] rcu_needs_cpu
>   0.56%  ceph-osd                     [.] PGLog::undirty()
>   0.51%  libpthread-2.11.3.so         [.] pthread_mutex_lock
>   0.50%  [vdso]                       [.] 0x7fff6dbff6ce
>
> Same load, but with Cuttlefish-OSDs:
>  19.23%  [kernel]                     [k] intel_idle
>   6.43%  [kernel]                     [k] find_busiest_group
>   5.25%  kvm                          [.] 0x152a75
>   2.70%  ceph-osd                     [.] ceph_crc32c_le
>   2.44%  [kernel]                     [k] _raw_spin_lock
>   1.95%  [kernel]                     [k] copy_user_generic_string
>   1.53%  [kernel]                     [k] default_send_IPI_mask_sequence_phys
>   1.28%  [kernel]                     [k] __hrtimer_start_range_ns
>   1.21%  [kernel]                     [k] do_select
>   1.19%  [kernel]                     [k] hrtimer_interrupt
>   1.19%  [kernel]                     [k] _raw_spin_unlock_irqrestore
>   1.16%  [kernel]                     [k] fget_light
>   1.12%  [kernel]                     [k] cpumask_next_and
>   1.11%  [kernel]                     [k] clockevents_program_event
>   1.08%  [kernel]                     [k] __schedule
>   1.08%  [kernel]                     [k] find_next_bit
>   0.99%  [kernel]                     [k] _raw_spin_lock_irqsave
>   0.90%  [kernel]                     [k] native_write_cr0
>   0.83%  [kernel]                     [k] native_write_msr_safe
>   0.82%  [kernel]                     [k] apic_timer_interrupt
>   0.70%  libc-2.11.3.so               [.] memcpy
>   0.68%  [kernel]                     [k] sync_inodes_sb
>   0.63%  [kernel]                     [k] tg_load_down
>   0.63%  [kernel]                     [k] load_balance
>   0.61%  libpthread-2.11.3.so         [.] pthread_mutex_lock
>   0.58%  [kernel]                     [k] rcu_needs_cpu
>   0.57%  [kernel]                     [k] fput
>   0.56%  libc-2.11.3.so               [.] 0x7fb29
>   0.54%  [vdso]                       [.] 0x7fff2afb873a
>   0.50%  [kernel]                     [k] iput
>   0.50%  [kernel]                     [k] reschedule_interrupt
>
> It seems to me that "libleveldb" accounts for significant extra CPU
> load on Dumpling.  Another interesting fact is that I only see it use
> so much CPU on my production-cluster, which is serving around 80 KVMs
> over RBD, over 2x10Gbit/s ethernet per node, and so has a much higher
> and more diverse load than what I can generate on my test-cluster.
>
> Any ideas about this particular one?  Did anything change between
> Cuttlefish and Dumpling that would account for higher CPU-usage by
> "libleveldb", particularly under this client-load?  Anything I can try
> to bring it down?  Thanks!
>
>
>    Regards,
>
>      Oliver
>
> On vr, 2013-08-23 at 13:55 -0700, Samuel Just wrote:
>> Ok, can you try setting filestore_op_threads to 1 on both cuttlefish
>> and wip-dumpling-perf (with and without wbthrottle, default wbthrottle
>> settings).  I suspect I created contention in the filestore op threads
>> (FileStore::lfn_open specifically), and if so setting it to only use 1
>> thread should even out the performance.
>> -Sam
>>
>> On Fri, Aug 23, 2013 at 1:44 PM, Oliver Daudey <oliver@xxxxxxxxx> wrote:
>> > Hey Samuel,
>> >
>> > I commented the earlier settings out, so it was with defaults.
>> >
>> >
>> >    Regards,
>> >
>> >       Oliver
>> >
>> > On vr, 2013-08-23 at 13:35 -0700, Samuel Just wrote:
>> >> When you were running with the wbthrottle on, did you have the
>> >> settings I gave you earlier set, or was it using the defaults?
>> >> -Sam
>> >>
>> >> On Fri, Aug 23, 2013 at 12:48 PM, Oliver Daudey <oliver@xxxxxxxxx> wrote:
>> >> > Hey Samuel,
>> >> >
>> >> > That changed something, for the better. :-)
>> >> >
>> >> > Your test-version, with wbthrottle off:
>> >> > # ceph-osd --version
>> >> > ceph version 0.67.1-18-g3fe3368 (3fe3368ac7178dcd312e89d264d8d81307e582d8)
>> >> > # ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok config show | grep wbthrottle_enable
>> >> >   "filestore_wbthrottle_enable": "false",
>> >> > # rbd bench-write test --io-pattern=rand
>> >> > bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern rand
>> >> >   SEC       OPS   OPS/SEC   BYTES/SEC
>> >> >     1       838    837.36  2283825.44
>> >> >     2      1679    827.66  2309860.21
>> >> >     3      2488    829.30  2320967.64
>> >> >     4      2812    655.55  1847713.58
>> >> >     5      3301    659.64  1849006.69
>> >> > <...>
>> >> >   120     54113    450.83  1271297.14
>> >> >   121     54847    453.13  1277957.21
>> >> >   122     55502    454.93  1281769.31
>> >> >   123     55768    451.09  1271512.03
>> >> >   124     55853    450.42  1269452.15
>> >> >   126     55889    442.22  1246494.89
>> >> >   127     56351    443.71  1250308.66
>> >> >   128     57135    446.36  1257746.91
>> >> >   129     57445    444.03  1251566.31
>> >> >   130     57944    445.72  1256426.99
>> >> >
>> >> > Now with the wbthrottle-option enabled:
>> >> > # ceph-osd --version
>> >> > ceph version 0.67.1-18-g3fe3368 (3fe3368ac7178dcd312e89d264d8d81307e582d8)
>> >> > # ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok config show | grep wbthrottle_enable
>> >> >   "filestore_wbthrottle_enable": "true",
>> >> > # rbd bench-write test --io-pattern=rand
>> >> > bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern rand
>> >> >   SEC       OPS   OPS/SEC   BYTES/SEC
>> >> >     1       822    821.23  2303875.45
>> >> >     2      1510    738.54  2021390.37
>> >> >     3      2224    740.29  1989216.54
>> >> >     4      2843    708.01  1912598.24
>> >> >     5      3294    652.35  1774048.26
>> >> > <...>
>> >> >   120     59470    495.58  1388776.41
>> >> >   122     60067    489.29  1370954.96
>> >> >   123     60083    487.74  1366434.49
>> >> >   124     60449    487.49  1366477.77
>> >> >   125     61122    488.97  1370480.73
>> >> >   126     61679    489.52  1372290.03
>> >> >   127     62195    489.69  1372523.64
>> >> >   128     62608    489.02  1370226.98
>> >> >   129     62655    485.68  1360719.66
>> >> >   130     62688    482.01  1350560.76
>> >> >
>> >> > Slightly faster than with wbthrottle disabled.  For comparison, one
>> >> > more time with Cuttlefish-OSDs:
>> >> > # ceph-osd --version
>> >> > ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
>> >> > # rbd bench-write test --io-pattern=rand
>> >> > bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern rand
>> >> >   SEC       OPS   OPS/SEC   BYTES/SEC
>> >> >     1      1074   1073.80  3079631.51
>> >> >     2      2121   1060.21  3001542.58
>> >> >     3      3111    845.59  2422586.94
>> >> >     4      3132    745.40  2132915.48
>> >> >     5      3336    665.33  1904205.15
>> >> > <...>
>> >> >   120     63700    530.81  1530767.77
>> >> >   121     64781    535.37  1544861.37
>> >> >   122     64944    532.10  1535398.54
>> >> >   123     65312    527.70  1522971.49
>> >> >   124     65526    528.42  1525134.71
>> >> >   125     66160    529.12  1527158.94
>> >> >   126     67214    533.44  1539955.47
>> >> >   127     67855    534.29  1541965.13
>> >> >   128     67939    528.79  1526270.55
>> >> >   129     68178    528.44  1525239.57
>> >> >   130     68811    529.24  1527681.84
>> >> >
>> >> > Cuttlefish still wins, but by a much smaller margin now.  Looks like
>> >> > we're onto something.  The fdatasync seems to be the key here, rather
>> >> > than disabling wbthrottle.
>> >> >
>> >> >
>> >> >    Regards,
>> >> >
>> >> >       Oliver
>> >> >
>> >> > On 23-08-13 19:53, Samuel Just wrote:
>> >> >> I pushed a branch, wip-dumpling-perf.  It does two things:
>> >> >> 1) adds a config filestore_wbthrottle_enable (defaults to true) to
>> >> >> allow you to disable the wbthrottle altogether
>> >> >> 2) causes the wbthrottle when enabled to fdatasync rather than fsync.
>> >> >>
>> >> >> Can you rerun the random workload with that branch with
>> >> >> filestore_wbthrottle_enable on and then off?
>> >> >> -Sam
>> >> >>
>> >> >> On Fri, Aug 23, 2013 at 10:26 AM, Oliver Daudey <oliver@xxxxxxxxx> wrote:
>> >> >>> Hey Sage,
>> >> >>>
>> >> >>> I'm all for it and will help testing.
>> >> >>>
>> >> >>>
>> >> >>>    Regards,
>> >> >>>
>> >> >>>       Oliver
>> >> >
>> >>
>> >
>> >
>>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com