Re: Significant slowdown of osds since v0.67 Dumpling

Hey Mark and list,

FYI for you and the list: Samuel and I seem to have found and fixed the
remaining performance problems.  For those who can't wait, the fixes are in
"wip-dumpling-perf2" and will probably be in the next point release.


   Regards,

     Oliver

On 27-08-13 17:13, Mark Nelson wrote:
> Ok, definitely let us know how it goes!  For what it's worth, I'm now
> testing Sam's wip-dumpling-perf branch with the wbthrottle code disabled,
> comparing it both to that same branch with it enabled and to 0.67.1.  I
> don't have any perf data yet, but quite a bit of other data to look
> through, in terms of both RADOS bench and RBD.
> 
> Mark
> 
> On 08/27/2013 10:07 AM, Oliver Daudey wrote:
>> Hey Mark,
>>
>> That will take a day or so for me to know with enough certainty.  With
>> the low CPU usage and preliminary results today, I'm confident enough to
>> upgrade all OSDs in production and test the cluster all-Dumpling
>> tomorrow.  For now, I only upgraded a single OSD and measured its CPU
>> usage and whatever performance effects that had on the cluster, so that
>> if I were to lose that OSD, I could recover. :-)  Will get back to you.
>>
>>
>>     Regards,
>>
>>         Oliver
>>
>> On 27-08-13 15:04, Mark Nelson wrote:
>>> Hi Oliver/Matthew,
>>>
>>> Ignoring CPU usage, has performance remained slower as well?
>>>
>>> Mark
>>>
>>> On 08/27/2013 03:08 AM, Oliver Daudey wrote:
>>>> Hey Samuel,
>>>>
>>>> The "PGLog::check()" is no longer visible in profiling, so the patch
>>>> helped there.  Unfortunately, it doesn't seem to have brought down the
>>>> OSD's CPU load much.  Leveldb still uses much more CPU than under
>>>> Cuttlefish.  On my test-cluster, I didn't notice any difference in the
>>>> RBD bench results either, so I have to assume it didn't help
>>>> performance much.
>>>>
>>>> Here's the `perf top' I took just now on my production cluster with
>>>> your new version, under regular load.  Also note the "memcmp" and
>>>> "memcpy", which don't show up when running a Cuttlefish OSD:
>>>>    15.65%  [kernel]                     [k] intel_idle
>>>>     7.20%  libleveldb.so.1.9            [.] 0x3ceae
>>>>     6.28%  libc-2.11.3.so               [.] memcmp
>>>>     5.22%  [kernel]                     [k] find_busiest_group
>>>>     3.92%  kvm                          [.] 0x2cf006
>>>>     2.40%  libleveldb.so.1.9            [.] leveldb::InternalKeyComparator::Compar
>>>>     1.95%  [kernel]                     [k] _raw_spin_lock
>>>>     1.69%  [kernel]                     [k] default_send_IPI_mask_sequence_phys
>>>>     1.46%  libc-2.11.3.so               [.] memcpy
>>>>     1.17%  libleveldb.so.1.9            [.] leveldb::Block::Iter::Next()
>>>>     1.16%  [kernel]                     [k] hrtimer_interrupt
>>>>     1.07%  [kernel]                     [k] native_write_cr0
>>>>     1.01%  [kernel]                     [k] __hrtimer_start_range_ns
>>>>     1.00%  [kernel]                     [k] clockevents_program_event
>>>>     0.93%  [kernel]                     [k] find_next_bit
>>>>     0.93%  libstdc++.so.6.0.13          [.] std::string::_M_mutate(unsigned long,
>>>>     0.89%  [kernel]                     [k] cpumask_next_and
>>>>     0.87%  [kernel]                     [k] __schedule
>>>>     0.85%  [kernel]                     [k] _raw_spin_unlock_irqrestore
>>>>     0.85%  [kernel]                     [k] do_select
>>>>     0.84%  [kernel]                     [k] apic_timer_interrupt
>>>>     0.80%  [kernel]                     [k] fget_light
>>>>     0.79%  [kernel]                     [k] native_write_msr_safe
>>>>     0.76%  [kernel]                     [k] _raw_spin_lock_irqsave
>>>>     0.66%  libc-2.11.3.so               [.] 0xdc6d8
>>>>     0.61%  libpthread-2.11.3.so         [.] pthread_mutex_lock
>>>>     0.61%  [kernel]                     [k] tg_load_down
>>>>     0.59%  [kernel]                     [k] reschedule_interrupt
>>>>     0.59%  libsnappy.so.1.1.2           [.] snappy::RawUncompress(snappy::Source*,
>>>>     0.56%  libstdc++.so.6.0.13          [.] std::string::append(char const*, unsig
>>>>     0.54%  [kvm_intel]                  [k] vmx_vcpu_run
>>>>     0.53%  [kernel]                     [k] copy_user_generic_string
>>>>     0.53%  [kernel]                     [k] load_balance
>>>>     0.50%  [kernel]                     [k] rcu_needs_cpu
>>>>     0.45%  [kernel]                     [k] fput
>>>>
>>>>
>>>>      Regards,
>>>>
>>>>        Oliver
>>>>
>>>> On ma, 2013-08-26 at 23:33 -0700, Samuel Just wrote:
>>>>> I just pushed a patch to wip-dumpling-log-assert (based on current
>>>>> dumpling head).  I had disabled most of the code in PGLog::check() but
>>>>> left an (I thought) innocuous assert.  It seems that with (at least)
>>>>> g++ 4.6.3, STL list::size() is linear in the size of the list, so that
>>>>> assert actually traverses the pg log on each operation.  The patch in
>>>>> wip-dumpling-log-assert should disable that assert as well by default.
>>>>> Let me know if it helps.
>>>>>
>>>>> It should be built within an hour of this email.
>>>>> -Sam
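
[For illustration: the cost Sam describes above can be sidestepped by
tracking the element count alongside the list, so size checks stay O(1)
even where list::size() itself walks the whole list.  This is a minimal
sketch with hypothetical names, not Ceph's actual PGLog code:]

```cpp
#include <cstddef>
#include <list>

// In libstdc++ before C++11 (e.g. g++ 4.6.3), std::list::size() traversed
// the entire list, so asserting on size() inside a hot path made every
// operation O(n) in the log length.  Maintaining the count incrementally
// avoids that traversal.  CountedLog is an illustrative name, not Ceph's.
struct CountedLog {
  std::list<int> entries;
  std::size_t count = 0;  // updated on every mutation: O(1) size queries

  void add(int e) {
    entries.push_back(e);
    ++count;
  }
  void trim_front() {
    if (!entries.empty()) {
      entries.pop_front();
      --count;
    }
  }
  // O(1), regardless of how the underlying list implements size().
  std::size_t size() const { return count; }
};
```

(C++11 and later require std::list::size() itself to be O(1), which is
another reason the problem was tied to that older toolchain.)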
>>>>>
>>>>> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson
>>>>> <manderson8787@xxxxxxxxx> wrote:
>>>>>> Hi Guys,
>>>>>>
>>>>>> I'm having the same problem as Oliver with 0.67.2.  CPU usage is
>>>>>> around double that of the 0.61.8 OSDs in the same cluster, which
>>>>>> appears to be causing the performance decrease.
>>>>>>
>>>>>> I did a perf comparison (not sure if I did it right but it seems ok).
>>>>>> Both hosts are the same spec running Ubuntu 12.04.1 (3.2 kernel),
>>>>>> journal and osd data are on an SSD, OSDs are in the same pool with
>>>>>> the same weight, and the perf tests were run at the same time on a
>>>>>> real-world load consisting of RBD traffic only.
>>>>>>
>>>>>> Dumpling -
>>>>>>
>>>>>> Events: 332K cycles
>>>>>>    17.93%  ceph-osd  libc-2.15.so           [.] 0x15d523
>>>>>>    17.03%  ceph-osd  ceph-osd               [.] 0x5c2897
>>>>>>     4.66%  ceph-osd  ceph-osd               [.] leveldb::InternalKeyComparator::Compare(leveldb::Slice const&, level
>>>>>>     3.46%  ceph-osd  ceph-osd               [.] leveldb::Block::Iter::Next()
>>>>>>     2.70%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::_M_mutate(unsigned long, unsigned long, unsigned long)
>>>>>>     2.60%  ceph-osd  ceph-osd               [.] PGLog::check()
>>>>>>     2.57%  ceph-osd  [kernel.kallsyms]      [k] __ticket_spin_lock
>>>>>>     2.49%  ceph-osd  ceph-osd               [.] ceph_crc32c_le_intel
>>>>>>     1.93%  ceph-osd  libsnappy.so.1.1.2     [.] snappy::RawUncompress(snappy::Source*, char*)
>>>>>>     1.53%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::append(char const*, unsigned long)
>>>>>>     1.47%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator new(unsigned long)
>>>>>>     1.33%  ceph-osd  [kernel.kallsyms]      [k] copy_user_generic_string
>>>>>>     0.98%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator delete(void*)
>>>>>>     0.90%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::assign(char const*, unsigned long)
>>>>>>     0.75%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::_M_replace_safe(unsigned long, unsigned long, char cons
>>>>>>     0.58%  ceph-osd  [kernel.kallsyms]      [k] wait_sb_inodes
>>>>>>     0.55%  ceph-osd  ceph-osd               [.] leveldb::Block::Iter::Valid() const
>>>>>>     0.51%  ceph-osd  libtcmalloc.so.0.1.0   [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::
>>>>>>     0.50%  ceph-osd  libtcmalloc.so.0.1.0   [.] tcmalloc::CentralFreeList::FetchFromSpans()
>>>>>>     0.47%  ceph-osd  libstdc++.so.6.0.16    [.] 0x9ebc8
>>>>>>     0.46%  ceph-osd  libc-2.15.so           [.] vfprintf
>>>>>>     0.45%  ceph-osd  [kernel.kallsyms]      [k] find_busiest_group
>>>>>>     0.45%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::resize(unsigned long, char)
>>>>>>     0.43%  ceph-osd  libpthread-2.15.so     [.] pthread_mutex_unlock
>>>>>>     0.41%  ceph-osd  [kernel.kallsyms]      [k] iput_final
>>>>>>     0.40%  ceph-osd  ceph-osd               [.] leveldb::Block::Iter::Seek(leveldb::Slice const&)
>>>>>>     0.39%  ceph-osd  libc-2.15.so           [.] _IO_vfscanf
>>>>>>     0.39%  ceph-osd  ceph-osd               [.] leveldb::Block::Iter::key() const
>>>>>>     0.39%  ceph-osd  libtcmalloc.so.0.1.0   [.] tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>>>>>>     0.37%  ceph-osd  libstdc++.so.6.0.16    [.] std::basic_ostream<char, std::char_traits<char> >& std::__ostream_in
>>>>>>
>>>>>>
>>>>>> Cuttlefish -
>>>>>>
>>>>>> Events: 160K cycles
>>>>>>     7.53%  ceph-osd  [kernel.kallsyms]      [k] __ticket_spin_lock
>>>>>>     6.26%  ceph-osd  libc-2.15.so           [.] 0x89115
>>>>>>     3.06%  ceph-osd  ceph-osd               [.] ceph_crc32c_le
>>>>>>     2.66%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator new(unsigned long)
>>>>>>     2.46%  ceph-osd  [kernel.kallsyms]      [k] find_busiest_group
>>>>>>     1.80%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator delete(void*)
>>>>>>     1.42%  ceph-osd  [kernel.kallsyms]      [k] try_to_wake_up
>>>>>>     1.27%  ceph-osd  ceph-osd               [.] 0x531fb6
>>>>>>     1.21%  ceph-osd  libstdc++.so.6.0.16    [.] 0x9ebc8
>>>>>>     1.14%  ceph-osd  [kernel.kallsyms]      [k] wait_sb_inodes
>>>>>>     1.02%  ceph-osd  libc-2.15.so           [.] _IO_vfscanf
>>>>>>     1.01%  ceph-osd  [kernel.kallsyms]      [k] update_shares
>>>>>>     0.98%  ceph-osd  [kernel.kallsyms]      [k] filemap_fdatawait_range
>>>>>>     0.90%  ceph-osd  libstdc++.so.6.0.16    [.] std::basic_ostream<char, std::char_traits<char> >& std
>>>>>>     0.89%  ceph-osd  [kernel.kallsyms]      [k] iput_final
>>>>>>     0.79%  ceph-osd  libstdc++.so.6.0.16    [.] std::basic_string<char, std::char_traits<char>, std::a
>>>>>>     0.79%  ceph-osd  [kernel.kallsyms]      [k] copy_user_generic_string
>>>>>>     0.78%  ceph-osd  libc-2.15.so           [.] vfprintf
>>>>>>     0.70%  ceph-osd  libtcmalloc.so.0.1.0   [.] tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc:
>>>>>>     0.69%  ceph-osd  [kernel.kallsyms]      [k] __d_lookup_rcu
>>>>>>     0.69%  ceph-osd  libtcmalloc.so.0.1.0   [.] tcmalloc::CentralFreeList::FetchFromSpans()
>>>>>>     0.66%  ceph-osd  [kernel.kallsyms]      [k] igrab
>>>>>>     0.63%  ceph-osd  [kernel.kallsyms]      [k] update_cfs_load
>>>>>>     0.63%  ceph-osd  [kernel.kallsyms]      [k] link_path_walk
>>>>>>
>>>>>> If you'd like some more tests run, just let me know; more than happy
>>>>>> to help.
>>>>>>
>>>>>> Thanks
>>>>>> -Matt
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>>>
>>>
>>
>>
> 
> 




