Re: Significant slowdown of osds since v0.67 Dumpling

Hey Mark,

On Sat, 2013-08-17 at 08:16 -0500, Mark Nelson wrote:
> On 08/17/2013 06:13 AM, Oliver Daudey wrote:
> > Hey all,
> >
> > This is a copy of Bug #6040 (http://tracker.ceph.com/issues/6040) I
> > created in the tracker.  Thought I would pass it through the list as
> > well, to get an idea if anyone else is running into it.  It may only
> > show under higher loads.  More info about my setup is in the bug-report
> > above.  Here goes:
> >
> >
> > I'm running a Ceph-cluster with 3 nodes, each of which runs a mon, osd
> > and mds. I'm using RBD on this cluster as storage for KVM, CephFS is
> > unused at this time. While still on v0.61.7 Cuttlefish, I got 70-100+
> > MB/sec on simple linear writes to a file with `dd' inside a VM on this
> > cluster under regular load and the osds usually averaged 20-100%
> > CPU-utilisation in `top'. After the upgrade to Dumpling, CPU-usage for
> > the osds shot up to 100% to 400% in `top' (multi-core system) and the
> > speed for my writes with `dd' inside a VM dropped to 20-40MB/sec. Users
> > complained that disk-access inside the VMs was significantly slower and
> > the backups of the RBD-store I was running also fell behind quickly.
> >
> > After downgrading only the osds to v0.61.7 Cuttlefish and leaving the
> > rest at 0.67 Dumpling, speed and load returned to normal. I have
> > reproduced this performance-hit after upgrading a similar test-cluster
> > under no additional load at all. Although CPU-usage for the osds wasn't
> > as dramatic during these tests because there was no base-load from other
> > VMs, I/O-performance dropped significantly after upgrading during these
> > tests as well, and returned to normal after downgrading the osds.
> >
> > I'm not sure what to make of it. There are no visible errors in the logs
> > and everything runs and reports good health; it's just a lot slower,
> > with a lot more CPU-usage.
> 
> Hi Oliver,
> 
> If you have access to the perf command on this system, could you try 
> running:
> 
> "sudo perf top"
> 
> And if that doesn't give you much,
> 
> "sudo perf record -g"
> 
> then:
> 
> "sudo perf report | less"
> 
> during the period of high CPU usage?  This will give you a call graph. 
> There may be symbols missing, but it might help track down what the OSDs 
> are doing.

Thanks for your help!  I did a couple of runs on my test-cluster,
loading it with writes from 3 VMs concurrently and measuring at the
first node, once with all components at 0.67 Dumpling and once with
only the osds replaced by 0.61.7 Cuttlefish.  I let `perf top' run and
settle for a while, then copied everything that showed up in red or
green into this post.
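
For reference, this is roughly how I generated the load inside each of
the 3 VMs; the file-name and sizes here are examples, not the exact
values I used:

  # simple linear write test inside a VM
  dd if=/dev/zero of=/tmp/ddtest bs=1M count=2048 conv=fdatasync

  # meanwhile, on the first osd-node
  sudo perf top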
Here are the results:

First, with 0.61.7 osds:

 19.91%  [kernel]                    [k] intel_idle
 10.18%  [kernel]                    [k] _raw_spin_lock_irqsave
  6.79%  ceph-osd                    [.] ceph_crc32c_le
  4.93%  [kernel]                    [k] default_send_IPI_mask_sequence_phys
  2.71%  [kernel]                    [k] copy_user_generic_string
  1.42%  libc-2.11.3.so              [.] memcpy
  1.23%  [kernel]                    [k] find_busiest_group
  1.13%  librados.so.2.0.0           [.] ceph_crc32c_le_intel
  1.11%  [kernel]                    [k] _raw_spin_lock
  0.99%  kvm                         [.] 0x1931f8
  0.92%  [igb]                       [k] igb_poll
  0.87%  [kernel]                    [k] native_write_cr0
  0.80%  [kernel]                    [k] csum_partial
  0.78%  [kernel]                    [k] __do_softirq
  0.63%  [kernel]                    [k] hpet_legacy_next_event
  0.53%  [ip_tables]                 [k] ipt_do_table
  0.50%  libc-2.11.3.so              [.] 0x74433

Second test, with 0.67 osds:

 18.32%  [kernel]                      [k] intel_idle
  7.58%  [kernel]                      [k] _raw_spin_lock_irqsave
  7.04%  ceph-osd                      [.] PGLog::undirty()
  4.39%  ceph-osd                      [.] ceph_crc32c_le_intel
  3.92%  [kernel]                      [k] default_send_IPI_mask_sequence_phys
  2.25%  [kernel]                      [k] copy_user_generic_string
  1.76%  libc-2.11.3.so                [.] memcpy
  1.56%  librados.so.2.0.0             [.] ceph_crc32c_le_intel
  1.40%  libc-2.11.3.so                [.] vfprintf
  1.12%  libc-2.11.3.so                [.] 0x7217b
  1.05%  [kernel]                      [k] _raw_spin_lock
  1.01%  [kernel]                      [k] find_busiest_group
  0.83%  kvm                           [.] 0x193ab8
  0.80%  [kernel]                      [k] native_write_cr0
  0.76%  [kernel]                      [k] __do_softirq
  0.73%  libc-2.11.3.so                [.] _IO_default_xsputn
  0.70%  [kernel]                      [k] csum_partial
  0.68%  [igb]                         [k] igb_poll
  0.58%  [kernel]                      [k] hpet_legacy_next_event
  0.54%  [kernel]                      [k] __schedule


What jumps right out is "PGLog::undirty()", which doesn't show up with
0.61.7 at all, but sits right near the top of CPU-usage in 0.67.  Note
that I didn't manage to fully load the test-cluster CPU-wise because
of network-constraints, and I don't want to take any extra risks by
testing this on the production-cluster, but it seems we have found a
possible culprit.
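
If a full call graph would be useful, I can also do a capture on the
test-cluster next time it's under load, along the lines you suggested;
something like this (the 30-second window is just an example):

  # system-wide profile with call graphs, while the writes are running
  sudo perf record -a -g -- sleep 30
  sudo perf report | less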

Any ideas?  Thanks again!



   Regards,

      Oliver

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



