On 07/17/2015 08:38 AM, J David wrote:
This is the same cluster I posted about back in April. Since then,
the situation has gotten significantly worse.
Here is what iostat looks like for the one active RBD image on this cluster:
Device:   rrqm/s   wrqm/s      r/s      w/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
vdb         0.00     0.00    14.10     0.00   685.65     0.00    97.26     3.43   299.40   299.40     0.00    70.92   100.00
vdb         0.00     0.00     1.10     0.00   140.80     0.00   256.00     3.00  2753.09  2753.09     0.00   909.09   100.00
vdb         0.00     0.00    17.40     0.00  2227.20     0.00   256.00     3.00   178.78   178.78     0.00    57.47   100.00
vdb         0.00     0.00     1.30     0.00   166.40     0.00   256.00     3.00  2256.62  2256.62     0.00   769.23   100.00
vdb         0.00     0.00     8.20     0.00  1049.60     0.00   256.00     3.00   362.10   362.10     0.00   121.95   100.00
vdb         0.00     0.00     1.10     0.00   140.80     0.00   256.00     3.00  2517.45  2517.45     0.00   909.45   100.04
vdb         0.00     0.00     1.10     0.00   140.66     0.00   256.00     3.00  2863.64  2863.64     0.00   909.09    99.90
vdb         0.00     0.00     0.70     0.00    89.60     0.00   256.00     3.00  3898.86  3898.86     0.00  1428.57   100.00
vdb         0.00     0.00     0.60     0.00    76.80     0.00   256.00     3.00  5093.33  5093.33     0.00  1666.67   100.00
vdb         0.00     0.00     1.20     0.00   153.60     0.00   256.00     3.00  2568.33  2568.33     0.00   833.33   100.00
vdb         0.00     0.00     1.30     0.00   166.40     0.00   256.00     3.00  2457.85  2457.85     0.00   769.23   100.00
vdb         0.00     0.00    13.90     0.00  1779.20     0.00   256.00     3.00   220.95   220.95     0.00    71.94   100.00
vdb         0.00     0.00     1.00     0.00   128.00     0.00   256.00     3.00  2250.40  2250.40     0.00  1000.00   100.00
vdb         0.00     0.00     1.30     0.00   166.40     0.00   256.00     3.00  2798.77  2798.77     0.00   769.23   100.00
vdb         0.00     0.00     0.90     0.00   115.20     0.00   256.00     3.00  3304.00  3304.00     0.00  1111.11   100.00
vdb         0.00     0.00     0.90     0.00   115.20     0.00   256.00     3.00  3425.33  3425.33     0.00  1111.11   100.00
vdb         0.00     0.00     1.30     0.00   166.40     0.00   256.00     3.00  2290.77  2290.77     0.00   769.23   100.00
vdb         0.00     0.00     4.30     0.00   550.40     0.00   256.00     3.00   721.30   721.30     0.00   232.56   100.00
vdb         0.00     0.00     1.60     0.00   204.80     0.00   256.00     3.00  1894.75  1894.75     0.00   625.00   100.00
vdb         0.00     0.00     1.20     0.00   153.60     0.00   256.00     3.00  2375.00  2375.00     0.00   833.33   100.00
vdb         0.00     0.00     0.90     0.00   115.20     0.00   256.00     3.00  3036.44  3036.44     0.00  1111.11   100.00
vdb         0.00     0.00     1.10     0.00   140.80     0.00   256.00     3.00  3086.18  3086.18     0.00   909.09   100.00
vdb         0.00     0.00     0.90     0.00   115.20     0.00   256.00     3.00  2480.44  2480.44     0.00  1111.11   100.00
vdb         0.00     0.00     1.20     0.00   153.60     0.00   256.00     3.00  3124.33  3124.33     0.00   833.67   100.04
vdb         0.00     0.00     0.80     0.00   102.40     0.00   256.00     3.00  3228.00  3228.00     0.00  1250.00   100.00
vdb         0.00     0.00     1.20     0.00   153.60     0.00   256.00     3.00  2439.33  2439.33     0.00   833.33   100.00
vdb         0.00     0.00     1.30     0.00   166.40     0.00   256.00     3.00  2567.08  2567.08     0.00   769.23   100.00
vdb         0.00     0.00     0.80     0.00   102.40     0.00   256.00     3.00  3023.00  3023.00     0.00  1250.00   100.00
vdb         0.00     0.00     4.80     0.00   614.40     0.00   256.00     3.00   712.50   712.50     0.00   208.33   100.00
vdb         0.00     0.00     1.30     0.00   118.75     0.00   182.69     3.00  2003.69  2003.69     0.00   769.23   100.00
vdb         0.00     0.00    10.50     0.00  1344.00     0.00   256.00     3.00   344.46   344.46     0.00    95.24   100.00
So: between 0 and 15 reads per second, no write activity, a constant
queue depth of 3+, await times measured in seconds, and 100% I/O
utilization, all for read throughput of 100-200KB/sec. Even trivial
writes can hang for 15-60 seconds before completing.
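For reference, those samples are plain extended iostat output; something
along these lines reproduces the columns above (the 10-second interval is
just an example, not necessarily what was used here):

  # extended (-x) per-device (-d) stats in kB (-k) for vdb, sampled every 10s
  iostat -dxk vdb 10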
Sometimes this behavior "goes away" for a while and things go back to
what we saw in April: 50 IOPS (read or write) and 5-20MB/sec of I/O
throughput. But it always comes back.
The hardware of the ceph cluster is:
- Three ceph nodes
- Two of the ceph nodes have 64GiB RAM and 12 x 5TB SATA drives
- One of the ceph nodes has 32GiB RAM and 4 x 5TB SATA drives
- All ceph nodes have Intel E5-2609 v2 (2.50GHz quad-core) CPUs
- Everything is 10GBase-T
- All three nodes are running Ceph 0.80.9
The ceph hardware is all borderline idle. CPU utilization sits at 3-5%,
and iostat reports the individual disks hovering around 4-7% utilization
at any given time. The nodes do appear to be using most of the available
RAM for OSD caching.
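For what it's worth, the checks behind those numbers are nothing exotic;
roughly the following, run on each ceph node (the 5-second interval is
arbitrary), plus the usual cluster status commands:

  # overall cluster state
  ceph -s
  ceph health detail

  # per-disk utilization on the OSD node, 5-second samples
  iostat -dxk 5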
The client is a KVM virtual machine running on a server by itself.
Inside the virtual machine, the CPU reports 100% utilization, nearly all
of it iowait. The host it runs on reports everything as idle (99.1%
idle).
Something is *definitely* wrong. Does anyone have any idea what it might be?
Thanks for any help with this!
Hi J David,
Forgive me if you covered this in April, but have you tried rados bench
from the hypervisor (or another client node)?
Something like:
rados -p <pool> bench 30 write
just to see how it handles 4MB object writes. You can play around with
the -t (concurrent operations) and -b (object size) parameters to try
different workloads. If rados bench is also terribly slow, then you
might want to start looking for evidence of IO getting hung up on a
specific disk or node.
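For example -- the pool name, run length, object size and concurrency
below are just illustrative values, and --no-cleanup keeps the benchmark
objects around so a read pass can follow from the same client:

  # 4MB object writes (the default), 16 in flight
  rados -p <pool> bench 30 write -t 16 --no-cleanup

  # small (4KB) object writes, to mimic small-IO behaviour
  rados -p <pool> bench 30 write -t 16 -b 4096 --no-cleanup

  # sequential read-back of the objects written above (same client host)
  rados -p <pool> bench 30 seq -t 16

If the numbers point at the cluster rather than the client, per-OSD
latencies ("ceph osd perf", if your version has it) and the slowest
recent requests from the OSD admin sockets are the next place I'd look,
e.g. on an OSD node (default socket path assumed):

  ceph --admin-daemon /var/run/ceph/ceph-osd.<id>.asok dump_historic_ops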
Mark
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com