Re: slow request and unresponsive kvm guests after upgrading ceph cluster and os, please help debugging

We've also seen some problems with FileStore on newer kernels; 4.9 is the last kernel that worked reliably with FileStore in my experience.

But I haven't seen problems with BlueStore related to the kernel version (well, except for that scrub bug, but my work-around for that is in all release versions).

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Mon, Jan 6, 2020 at 8:44 PM Jelle de Jong <jelledejong@xxxxxxxxxxxxx> wrote:
Hello everybody,

I have issues with very slow requests on a simple three-node cluster
here: four WDC enterprise disks and an Intel Optane NVMe journal on
identical high-memory nodes, with 10 Gb networking.

It was all working well with Ceph Hammer on Debian Wheezy, but I wanted
to upgrade to a supported version and test out bluestore as well. So I
upgraded to luminous on Debian Stretch and used ceph-volume to create
bluestore osds; everything went downhill from there.

I went back to filestore on all nodes, but I still have slow requests
and I cannot pinpoint the cause. I tried to debug and gathered
information to look at:

https://paste.debian.net/hidden/acc5d204/

First I thought it was the balancing that was making things slow. Then I
thought it might be the LVM layer, so I recreated the nodes without LVM
by switching from ceph-volume to ceph-disk: no difference, still slow
requests. Then I changed back from bluestore to filestore, but the
cluster is still very slow. Then I thought it was a CPU scheduling issue
and downgraded from the 5.x kernel, and CPU performance is at full speed
again. I thought maybe there was something weird with an osd and took
them out one by one, but slow requests are still showing up and client
performance from vms is really poor.

It feels as if a burst of small requests keeps blocking for a while and
then recovers again.
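
One way to check that impression is to group the times of the
slow-request warnings into bursts and see how long each one lasts. A
minimal sketch in Python, assuming the timestamps have already been
pulled out of the cluster log by hand; the name group_bursts, the
30-second gap and the sample times are all made up for illustration and
are not taken from the paste above:

```python
from datetime import datetime, timedelta


def group_bursts(timestamps, max_gap=timedelta(seconds=30)):
    """Group event times into bursts.

    A new burst starts whenever the gap to the previous event is
    larger than max_gap (an arbitrary threshold, tune as needed).
    """
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)   # close enough: same burst
        else:
            bursts.append([ts])     # gap too large: start a new burst
    return bursts


# Hypothetical sample times, not real cluster data.
sample = [datetime(2020, 1, 6, 20, 0, s) for s in (0, 5, 12)]
sample += [datetime(2020, 1, 6, 20, 5, s) for s in (1, 3)]

for burst in group_bursts(sample):
    duration = burst[-1] - burst[0]
    print(burst[0].time(), "lasted", duration, "with", len(burst), "events")
```

If the bursts turn out to be regularly spaced, that would point towards
something periodic rather than a single bad osd.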

Many thanks for helping out and looking at the URL.

If there are options which I should tune for an HDD with NVMe journal
setup, please share.

Jelle
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com