Re: Significant slowdown of osds since v0.67 Dumpling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Aug 27, 2013, at 2:08, Oliver Daudey <oliver@xxxxxxxxx> wrote:

> Hey Samuel,
> 
> The "PGLog::check()" is now no longer visible in profiling, so it helped
> for that.  Unfortunately, it doesn't seem to have helped to bring down
> the OSD's CPU-loading much.  Leveldb still uses much more than in
> Cuttlefish.  On my test-cluster, I didn't notice any difference in the
> RBD bench-results, either, so I have to assume that it didn't help
> performance much.
> 
> Here's the `perf top' I took just now on my production-cluster with your
> new version, under regular load.  Also note the "memcmp" and "memcpy",
> which also don't show up when running a Cuttlefish-OSD:

memcpy is in fact also present in your Cuttlefish OSD, just a bit further down the list (increased from .7% to 1.4%). 

memcmp definitely looks suspicious and is something we're looking into.

> 15.65%  [kernel]                     [k]
> intel_idle                            
>  7.20%  libleveldb.so.1.9            [.]
> 0x3ceae                               
>  6.28%  libc-2.11.3.so               [.]
> memcmp                                
>  5.22%  [kernel]                     [k]
> find_busiest_group                    
>  3.92%  kvm                          [.]
> 0x2cf006                              
>  2.40%  libleveldb.so.1.9            [.]
> leveldb::InternalKeyComparator::Compar
>  1.95%  [kernel]                     [k]
> _raw_spin_lock                        
>  1.69%  [kernel]                     [k]
> default_send_IPI_mask_sequence_phys   
>  1.46%  libc-2.11.3.so               [.]
> memcpy                                
>  1.17%  libleveldb.so.1.9            [.]
> leveldb::Block::Iter::Next()          
>  1.16%  [kernel]                     [k]
> hrtimer_interrupt                     
>  1.07%  [kernel]                     [k]
> native_write_cr0                      
>  1.01%  [kernel]                     [k]
> __hrtimer_start_range_ns              
>  1.00%  [kernel]                     [k]
> clockevents_program_event             
>  0.93%  [kernel]                     [k]
> find_next_bit                         
>  0.93%  libstdc++.so.6.0.13          [.]
> std::string::_M_mutate(unsigned long, 
>  0.89%  [kernel]                     [k]
> cpumask_next_and                      
>  0.87%  [kernel]                     [k]
> __schedule                            
>  0.85%  [kernel]                     [k]
> _raw_spin_unlock_irqrestore           
>  0.85%  [kernel]                     [k]
> do_select                             
>  0.84%  [kernel]                     [k]
> apic_timer_interrupt                  
>  0.80%  [kernel]                     [k]
> fget_light                            
>  0.79%  [kernel]                     [k]
> native_write_msr_safe                 
>  0.76%  [kernel]                     [k]
> _raw_spin_lock_irqsave                
>  0.66%  libc-2.11.3.so               [.]
> 0xdc6d8                               
>  0.61%  libpthread-2.11.3.so         [.]
> pthread_mutex_lock                    
>  0.61%  [kernel]                     [k]
> tg_load_down                          
>  0.59%  [kernel]                     [k]
> reschedule_interrupt                  
>  0.59%  libsnappy.so.1.1.2           [.]
> snappy::RawUncompress(snappy::Source*,
>  0.56%  libstdc++.so.6.0.13          [.] std::string::append(char
> const*, unsig
>  0.54%  [kvm_intel]                  [k]
> vmx_vcpu_run                          
>  0.53%  [kernel]                     [k]
> copy_user_generic_string              
>  0.53%  [kernel]                     [k]
> load_balance                          
>  0.50%  [kernel]                     [k]
> rcu_needs_cpu                         
>  0.45%  [kernel]                     [k] fput
> 
> 
>   Regards,
> 
>     Oliver
> 
> On ma, 2013-08-26 at 23:33 -0700, Samuel Just wrote:
>> I just pushed a patch to wip-dumpling-log-assert (based on current
>> dumpling head).  I had disabled most of the code in PGLog::check() but
>> left an (I thought) innocuous assert.  It seems that with (at least)
>> g++ 4.6.3, stl list::size() is linear in the size of the list, so that
>> assert actually traverses the pg log on each operation.  The patch in
>> wip-dumpling-log-assert should disable that assert as well by default.
>> Let me know if it helps.
>> 
>> It should be built within an hour of this email.
>> -Sam
>> 
>> On Mon, Aug 26, 2013 at 10:46 PM, Matthew Anderson
>> <manderson8787@xxxxxxxxx> wrote:
>>> Hi Guys,
>>> 
>>> I'm having the same problem as Oliver with 0.67.2. CPU usage is around
>>> double that of the 0.61.8 OSD's in the same cluster which appears to
>>> be causing the performance decrease.
>>> 
>>> I did a perf comparison (not sure if I did it right but it seems ok).
>>> Both hosts are the same spec running Ubuntu 12.04.1 (3.2 kernel),
>>> journal and osd data is on an SSD, OSD's are in the same pool with the
>>> same weight and the perf tests were run at the same time on a
>>> realworld load consisting of RBD traffic only.
>>> 
>>> Dumpling -
>>> 
>>> Events: 332K cycles
>>> 17.93%  ceph-osd  libc-2.15.so           [.] 0x15d523
>>> 17.03%  ceph-osd  ceph-osd               [.] 0x5c2897
>>>  4.66%  ceph-osd  ceph-osd               [.]
>>> leveldb::InternalKeyComparator::Compare(leveldb::Slice const&, level
>>>  3.46%  ceph-osd  ceph-osd               [.] leveldb::Block::Iter::Next()
>>>  2.70%  ceph-osd  libstdc++.so.6.0.16    [.]
>>> std::string::_M_mutate(unsigned long, unsigned long, unsigned long)
>>>  2.60%  ceph-osd  ceph-osd               [.] PGLog::check()
>>>  2.57%  ceph-osd  [kernel.kallsyms]      [k] __ticket_spin_lock
>>>  2.49%  ceph-osd  ceph-osd               [.] ceph_crc32c_le_intel
>>>  1.93%  ceph-osd  libsnappy.so.1.1.2     [.]
>>> snappy::RawUncompress(snappy::Source*, char*)
>>>  1.53%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::append(char
>>> const*, unsigned long)
>>>  1.47%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator new(unsigned long)
>>>  1.33%  ceph-osd  [kernel.kallsyms]      [k] copy_user_generic_string
>>>  0.98%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator delete(void*)
>>>  0.90%  ceph-osd  libstdc++.so.6.0.16    [.] std::string::assign(char
>>> const*, unsigned long)
>>>  0.75%  ceph-osd  libstdc++.so.6.0.16    [.]
>>> std::string::_M_replace_safe(unsigned long, unsigned long, char cons
>>>  0.58%  ceph-osd  [kernel.kallsyms]      [k] wait_sb_inodes
>>>  0.55%  ceph-osd  ceph-osd               [.]
>>> leveldb::Block::Iter::Valid() const
>>>  0.51%  ceph-osd  libtcmalloc.so.0.1.0   [.]
>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::
>>>  0.50%  ceph-osd  libtcmalloc.so.0.1.0   [.]
>>> tcmalloc::CentralFreeList::FetchFromSpans()
>>>  0.47%  ceph-osd  libstdc++.so.6.0.16    [.] 0x9ebc8
>>>  0.46%  ceph-osd  libc-2.15.so           [.] vfprintf
>>>  0.45%  ceph-osd  [kernel.kallsyms]      [k] find_busiest_group
>>>  0.45%  ceph-osd  libstdc++.so.6.0.16    [.]
>>> std::string::resize(unsigned long, char)
>>>  0.43%  ceph-osd  libpthread-2.15.so     [.] pthread_mutex_unlock
>>>  0.41%  ceph-osd  [kernel.kallsyms]      [k] iput_final
>>>  0.40%  ceph-osd  ceph-osd               [.]
>>> leveldb::Block::Iter::Seek(leveldb::Slice const&)
>>>  0.39%  ceph-osd  libc-2.15.so           [.] _IO_vfscanf
>>>  0.39%  ceph-osd  ceph-osd               [.] leveldb::Block::Iter::key() const
>>>  0.39%  ceph-osd  libtcmalloc.so.0.1.0   [.]
>>> tcmalloc::CentralFreeList::ReleaseToSpans(void*)
>>>  0.37%  ceph-osd  libstdc++.so.6.0.16    [.] std::basic_ostream<char,
>>> std::char_traits<char> >& std::__ostream_in
>>> 
>>> 
>>> Cuttlefish -
>>> 
>>> Events: 160K cycles
>>>  7.53%  ceph-osd  [kernel.kallsyms]      [k] __ticket_spin_lock
>>>  6.26%  ceph-osd  libc-2.15.so           [.] 0x89115
>>>  3.06%  ceph-osd  ceph-osd               [.] ceph_crc32c_le
>>>  2.66%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator new(unsigned long)
>>>  2.46%  ceph-osd  [kernel.kallsyms]      [k] find_busiest_group
>>>  1.80%  ceph-osd  libtcmalloc.so.0.1.0   [.] operator delete(void*)
>>>  1.42%  ceph-osd  [kernel.kallsyms]      [k] try_to_wake_up
>>>  1.27%  ceph-osd  ceph-osd               [.] 0x531fb6
>>>  1.21%  ceph-osd  libstdc++.so.6.0.16    [.] 0x9ebc8
>>>  1.14%  ceph-osd  [kernel.kallsyms]      [k] wait_sb_inodes
>>>  1.02%  ceph-osd  libc-2.15.so           [.] _IO_vfscanf
>>>  1.01%  ceph-osd  [kernel.kallsyms]      [k] update_shares
>>>  0.98%  ceph-osd  [kernel.kallsyms]      [k] filemap_fdatawait_range
>>>  0.90%  ceph-osd  libstdc++.so.6.0.16    [.] std::basic_ostream<char,
>>> std::char_traits<char> >& std
>>>  0.89%  ceph-osd  [kernel.kallsyms]      [k] iput_final
>>>  0.79%  ceph-osd  libstdc++.so.6.0.16    [.] std::basic_string<char,
>>> std::char_traits<char>, std::a
>>>  0.79%  ceph-osd  [kernel.kallsyms]      [k] copy_user_generic_string
>>>  0.78%  ceph-osd  libc-2.15.so           [.] vfprintf
>>>  0.70%  ceph-osd  libtcmalloc.so.0.1.0   [.]
>>> tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc:
>>>  0.69%  ceph-osd  [kernel.kallsyms]      [k] __d_lookup_rcu
>>>  0.69%  ceph-osd  libtcmalloc.so.0.1.0   [.]
>>> tcmalloc::CentralFreeList::FetchFromSpans()
>>>  0.66%  ceph-osd  [kernel.kallsyms]      [k] igrab
>>>  0.63%  ceph-osd  [kernel.kallsyms]      [k] update_cfs_load
>>>  0.63%  ceph-osd  [kernel.kallsyms]      [k] link_path_walk
>>> 
>>> If you'd like some more tests run just let me know, more than happy to help
>>> 
>>> Thanks
>>> -Matt
> 
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




[Index of Archives]     [Information on CEPH]     [Linux Filesystem Development]     [Ceph Development]     [Ceph Large]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [xfs]


  Powered by Linux