Hello list,

We've recently moved our 588TB Ceph cluster into production by migrating VMs onto it, but this morning we started receiving the following message:

    cluster xxxxxxxxetc
     health HEALTH_WARN 20 requests are blocked > 32 sec
     monmap e1: 3 mons at {ceph-mon-a=172.29.234.11:6789/0,ceph-mon-b=172.29.234.12:6789/0,ceph-mon-c=172.29.234.13:6789/0}, election epoch 306, quorum 0,1,2 ceph-mon-a,ceph-mon-b,ceph-mon-c
     osdmap e14763: 216 osds: 215 up, 215 in
     pgmap v6758511: 18072 pgs, 20 pools, 49884 GB data, 12451 kobjects
           99837 GB used, 488 TB / 586 TB avail
               18063 active+clean
                   9 active+clean+scrubbing+deep
     client io 3365 kB/s rd, 22576 kB/s wr, 1105 op/s

(We're down an OSD pending a disk replacement.)

These delays appear to have happened all over our cluster, on different hosts, at around the same time, so it doesn't appear to have been a single host that had a problem:

root@ceph-mon-a:~# grep 'slow request' /var/log/ceph/ceph.log | awk '{print $3}' | sort | uniq -c | sort -n
      2 osd.67
      3 osd.11
      4 osd.104
      4 osd.148
      4 osd.193
      5 osd.151
      6 osd.0
      6 osd.190
      9 osd.136
     10 osd.215
     15 osd.47
     18 osd.202
     44 osd.124
     62 osd.51
     74 osd.34
     84 osd.9
    186 osd.86

root@ceph-mon-a:~# for x in c1m1 c1m2 c2m1 c2m2 c3m1 c3m2; do echo ${x} && ssh ${x} mount | egrep "\-(67|11|104|148|193|151|0|190|136|215|47|202|124|51|34|9|86) "; done;
c1m1
/dev/sdaa1 on /var/lib/ceph/osd/ceph-0 type xfs (rw,noatime)
c1m2
/dev/sdm1 on /var/lib/ceph/osd/ceph-67 type xfs (rw,noatime)
/dev/sdai1 on /var/lib/ceph/osd/ceph-193 type xfs (rw,noatime)
/dev/sdal1 on /var/lib/ceph/osd/ceph-202 type xfs (rw,noatime)
c2m1
/dev/sds1 on /var/lib/ceph/osd/ceph-104 type xfs (rw,noatime)
/dev/sdp1 on /var/lib/ceph/osd/ceph-86 type xfs (rw,noatime)
/dev/sdaa1 on /var/lib/ceph/osd/ceph-151 type xfs (rw,noatime)
c2m2
/dev/sdg1 on /var/lib/ceph/osd/ceph-34 type xfs (rw,noatime)
/dev/sdag1 on /var/lib/ceph/osd/ceph-215 type xfs (rw,noatime)
c3m1
/dev/sdi1 on /var/lib/ceph/osd/ceph-47 type xfs (rw,noatime)
/dev/sdc1 on /var/lib/ceph/osd/ceph-11 type xfs (rw,noatime)
/dev/sdv1 on /var/lib/ceph/osd/ceph-136 type xfs (rw,noatime)
/dev/sdab1 on /var/lib/ceph/osd/ceph-124 type xfs (rw,noatime)
/dev/sdad1 on /var/lib/ceph/osd/ceph-148 type xfs (rw,noatime)
c3m2
/dev/sdd1 on /var/lib/ceph/osd/ceph-9 type xfs (rw,noatime)
/dev/sdm1 on /var/lib/ceph/osd/ceph-51 type xfs (rw,noatime)
/dev/sdal1 on /var/lib/ceph/osd/ceph-190 type xfs (rw,noatime)
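In case anyone wants to double-check the "around the same time" part, bucketing the same log entries by minute should show the window; this is just a sketch and assumes the standard ceph.log layout shown in the excerpt below (date in field 1, time in field 2):

grep 'slow request' /var/log/ceph/ceph.log | awk '{print $1, substr($2,1,5)}' | sort | uniq -c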
Log excerpt from around the middle of the time this was happening:

2014-02-19 08:42:41.246842 osd.151 172.29.234.23:6807/179269 6696 : [WRN] slow request 30.866614 seconds old, received at 2014-02-19 08:42:10.378934: osd_op(client.94103.1:136181794 rb.0.16fe8.238e1f29.00000005c3de [write 188416~4096] 5.7a1ac3a3 e14763) currently waiting for subops from [51]
2014-02-19 08:42:41.247117 osd.151 172.29.234.23:6807/179269 6697 : [WRN] slow request 30.866401 seconds old, received at 2014-02-19 08:42:10.379147: osd_op(client.94103.1:136181795 rb.0.16fe8.238e1f29.00000005c3de [write 1077248~12288] 5.7a1ac3a3 e14763) currently waiting for subops from [51]
2014-02-19 08:42:41.247125 osd.151 172.29.234.23:6807/179269 6698 : [WRN] slow request 30.865893 seconds old, received at 2014-02-19 08:42:10.379655: osd_op(client.94103.1:136181796 rb.0.16fe8.238e1f29.00000005c3de [write 1093632~12288] 5.7a1ac3a3 e14763) currently waiting for subops from [51]
2014-02-19 08:42:41.247133 osd.151 172.29.234.23:6807/179269 6699 : [WRN] slow request 30.865366 seconds old, received at 2014-02-19 08:42:10.380182: osd_op(client.94103.1:136181797 rb.0.16fe8.238e1f29.00000005c3de [write 1126400~12288] 5.7a1ac3a3 e14763) currently waiting for subops from [51]
2014-02-19 08:42:49.096313 osd.202 172.29.234.22:6842/6625 4937 : [WRN] 12 slow requests, 1 included below; oldest blocked for > 45.689579 secs
2014-02-19 08:42:49.096324 osd.202 172.29.234.22:6842/6625 4938 : [WRN] slow request 30.538718 seconds old, received at 2014-02-19 08:42:18.557537: osd_op(client.94103.1:136187133 rb.0.3df06.238e1f29.000000023481 [write 118784~4096] 5.a3a00d5a e14763) currently waiting for subops from [51]

We're running Ubuntu 12.04 with a 3.8 kernel (the 'linux-image-generic-lts-raring' package from the official repo) and Ceph 0.72-1precise. Storage hardware is SGI MIS [1] servers with 32-40 disks (and therefore OSDs) per host, 24 CPU cores and 64GB of RAM. FWIW, the three monitors are vastly overpowered Dell R710s with 16 cores and 32GB of RAM each.

Load on the OSD hosts fluctuates between 4.00 and 8.00, and iostat normally shows iowait around 10%, but we don't have those figures from when the issue was occurring.

I suspect there's not much I can do until the problem happens again and I can diagnose it while it's happening, but any help or advice towards diagnosing the root cause of this would be much appreciated!

[1] - https://www.sgi.com/products/storage/modular/server.html

Regards,
Dane
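PS: my rough plan for the next occurrence is to capture the in-flight ops from one of the affected OSDs (e.g. osd.86, which logged the most slow requests) together with per-disk stats on its host. This is only a sketch: it assumes the default admin socket path, and I'm not certain dump_historic_ops is present in 0.72.

# run on the host carrying osd.86, while the slow requests are active
ceph --admin-daemon /var/run/ceph/ceph-osd.86.asok dump_ops_in_flight
ceph --admin-daemon /var/run/ceph/ceph-osd.86.asok dump_historic_ops
iostat -x 5 3

dump_ops_in_flight should show what the blocked requests are waiting on, and the extended iostat output should show whether a particular disk is saturated at the time.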