Re: Btrfs slowdown

Sage Weil <sage@xxxxxxxxxxxx> · Mon, 8 Aug 2011 14:58:29 -0700 (PDT)

Hi Christian,

Are you still seeing this slowness?

sage

On Wed, 27 Jul 2011, Christian Brunner wrote:
> 2011/7/25 Chris Mason <chris.mason@xxxxxxxxxx>:
> > Excerpts from Christian Brunner's message of 2011-07-25 03:54:47 -0400:
> >> Hi,
> >>
> >> we are running a ceph cluster with btrfs as its base filesystem
> >> (kernel 3.0). At the beginning everything worked very well, but after
> >> a few days (2-3) things are getting very slow.
> >>
> >> When I look at the object store servers I see heavy disk-i/o on the
> >> btrfs filesystems (disk utilization is between 60% and 100%). I also
> >> did some tracing on the Ceph-Object-Store-Daemon, but I'm quite
> >> certain that the majority of the disk I/O is not caused by ceph or
> >> any other userland process.
> >>
> >> When I reboot the system(s) the problems go away for another 2-3 days,
> >> but after that, it starts again. I'm not sure if the problem is
> >> related to the kernel warning I've reported last week. At least there
> >> is no temporal relationship between the warning and the slowdown.
> >>
> >> Any hints on how to trace this would be welcome.
> >
> > The easiest way to trace this is with latencytop.
> >
> > Apply this patch:
> >
> > http://oss.oracle.com/~mason/latencytop.patch
> >
> > And then use latencytop -c for a few minutes while the system is slow.
> > Send the output here and hopefully we'll be able to figure it out.
> 
> I've now installed latencytop. Attached are two output files: The
> first is from yesterday and was created approximately half an hour after
> the boot. The second one is from today, uptime is 19h. The load on the
> system is already rising. Disk utilization is at approximately 50%.
> 
> Thanks for your help.
> 
> Christian
> 