Re: requests are blocked > 32 sec woes

Gregory Farnum <greg@xxxxxxxxxxx> · Mon, 9 Feb 2015 07:20:48 -0800



There are a lot of next steps on
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/

You probably want to look at the bits about using the admin socket,
and diagnosing slow requests. :)
-Greg
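
For anyone following the admin socket suggestion, here is a minimal sketch of how its output might be filtered for long-running requests. It assumes `ceph daemon osd.N dump_ops_in_flight` is run on the host that carries that OSD and returns JSON with an "ops" list whose entries have "age" (seconds) and "description" fields; those field names are an assumption and may differ between Ceph releases. The helper names are hypothetical, and the 32-second threshold just mirrors the warning in the subject line.

```python
import json
import subprocess


def dump_ops_in_flight(osd_id):
    """Ask one OSD's admin socket for its in-flight ops (must run on that OSD's host)."""
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "dump_ops_in_flight"]
    )
    return json.loads(out)


def slow_ops(dump, min_age=32.0):
    """Return (age, description) pairs for ops at least min_age seconds old, oldest first.

    Assumes each op is a dict with "age" and "description" keys (an assumption
    about the JSON layout, not a guaranteed schema).
    """
    hits = []
    for op in dump.get("ops", []):
        age = float(op.get("age", 0.0))
        if age >= min_age:
            hits.append((age, op.get("description", "")))
    return sorted(hits, reverse=True)


# Made-up sample in the assumed layout, so the filter can be tried without a cluster.
sample = {
    "num_ops": 3,
    "ops": [
        {"description": "example op A", "age": 45.2},
        {"description": "example op B", "age": 1.3},
        {"description": "example op C", "age": 120.0},
    ],
}

for age, desc in slow_ops(sample):
    print("%8.1fs  %s" % (age, desc))
```

On a live node, `slow_ops(dump_ops_in_flight(0))` would apply the same filter to osd.0. The descriptions of the oldest ops are usually the place to start reading.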

On Sun, Feb 8, 2015 at 8:48 PM, Matthew Monaco <matt@xxxxxxxxx> wrote:
> Hello!
>
> *** Shameless plug: Sage, I'm working with Dirk Grunwald on this cluster; I
> believe some of the members of your thesis committee were students of his =)
>
> We have a modest cluster at CU Boulder and are frequently plagued by "requests
> are blocked" issues. I'd greatly appreciate any insight or pointers. The issue
> is not specific to any one OSD; I'm pretty sure they've all shown up in ceph
> health detail at this point.
>
> We have 8 identical nodes:
>
>         - 5 * 1TB Seagate enterprise SAS drives
>           - btrfs
>         - 1 * Intel 480G S3500 SSD
>           - with 5*16G partitions as journals
>           - also hosting the OS, unfortunately
>         -  64G RAM
>         - 2 * Xeon E5-2630 v2
>           - So 24 hyperthreads @ 2.60 GHz
>         - 10G-ish IPoIB for networking
>
> So the cluster has 40TB over 40 OSDs total with a very straightforward crushmap.
> These nodes are also (unfortunately for the time being) OpenStack compute nodes
> and 99% of the usage is OpenStack volumes/images. I see a lot of kernel messages
> like:
>
>         ib_mthca 0000:02:00.0: Async event 16 for bogus QP 00dc0408
>
> which may or may not be correlated w/ the Ceph hangs.
>
> Other info: we have 3 mons on 3 of the 8 nodes listed above. The OpenStack
> volumes pool has 4096 PGs and a size of 3. This is probably too many PGs, but
> came from an initial misunderstanding of the formula in the documentation.
>
> Thanks,
> Matt
>
>
> PS - I'm trying to secure funds to get an additional 8 nodes with a little less
> RAM and CPU to move the OSDs to, with dual 10G Ethernet, and a SATA DOM for the
> OS so the SSD will be strictly journal. I may even be able to get an additional
> SSD or two per node to use for caching or simply to set a higher primary affinity.
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
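
As a rough check on the PG count mentioned in the quoted message, the arithmetic can be sketched as below. It assumes the rule of thumb from the Ceph documentation of that period: aim for roughly 100 PG copies per OSD, so total PGs is about (OSDs * 100) / replica count, rounded up to a power of two. The target of 100 and the function names are assumptions for illustration, not a tuning recommendation.

```python
import math


def pg_copies_per_osd(pg_num, pool_size, num_osds):
    """Average number of PG copies each OSD carries for one pool."""
    return pg_num * pool_size / num_osds


def suggested_total_pgs(num_osds, pool_size, target_per_osd=100):
    """Rule-of-thumb PG total: (OSDs * target) / replicas, rounded up to a power of two."""
    raw = num_osds * target_per_osd / pool_size
    return 2 ** math.ceil(math.log2(raw))


# Numbers from the quoted message: 40 OSDs, a 4096-PG pool with size 3.
print(pg_copies_per_osd(4096, 3, 40))   # 307.2
print(suggested_total_pgs(40, 3))       # 2048
```

Under those assumptions, the volumes pool alone puts about 307 PG copies on each OSD, roughly three times the target, which is consistent with the poster's suspicion that 4096 is too many.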