Dear Ceph Experts,

our Ceph cluster suddenly went into a state where OSDs constantly have blocked or slow requests, rendering the cluster unusable. This happened during normal use; there were no updates or other changes. All disks seem to be healthy (smartctl, iostat, etc.). A complete hardware reboot, including a system update on all nodes, has not helped. The network equipment also shows no trouble. We'd be glad for any advice on how to diagnose and solve this, as the cluster is basically at a standstill and we urgently need to get it back into operation.

Cluster structure: 6 nodes, 6x 3TB disks plus 1x system/journal SSD per node, one OSD per disk. We're running ceph version 0.67.4-1precise on Ubuntu 12.04.3 with kernel 3.8.0-33-generic (x86_64).

"ceph status" shows something like (it varies):

  cluster 899509fe-afe4-42f4-a555-bb044ca0f52d
   health HEALTH_WARN 77 requests are blocked > 32 sec
   monmap e1: 3 mons at {a=134.107.24.179:6789/0,b=134.107.24.181:6789/0,c=134.107.24.183:6789/0}, election epoch 312, quorum 0,1,2 a,b,c
   osdmap e32600: 36 osds: 36 up, 36 in
   pgmap v16404527: 14304 pgs: 14304 active+clean; 20153 GB data, 60630 GB used, 39923 GB / 100553 GB avail; 1506KB/s rd, 21246B/s wr, 545op/s
   mdsmap e478: 1/1/1 up {0=c=up:active}, 1 up:standby-replay

"ceph health detail" shows something like (it varies):

  HEALTH_WARN 363 requests are blocked > 32 sec; 22 osds have slow requests
  363 ops are blocked > 32.768 sec
  1 ops are blocked > 32.768 sec on osd.0
  8 ops are blocked > 32.768 sec on osd.3
  37 ops are blocked > 32.768 sec on osd.12
  [...]
  11 ops are blocked > 32.768 sec on osd.62
  45 ops are blocked > 32.768 sec on osd.65
  22 osds have slow requests

The number and identity of affected OSDs constantly changes (sometimes health even goes to OK for a moment).

Cheers and thanks for any ideas,
Oliver
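
P.S. If it would help to narrow this down, we can pull more detail from one of the currently affected OSDs and post the output here. Below is just a sketch of what we had in mind, using osd.12 as an example and assuming our 0.67.4 OSDs expose dump_ops_in_flight and perf dump on the admin socket at the default path:

  # ops currently blocked on this OSD, with their state and age
  ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok dump_ops_in_flight

  # internal counters (journal/filestore latencies, queue lengths, ...)
  ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok perf dump

  # temporarily raise logging on that OSD to see where a blocked op is sitting
  ceph tell osd.12 injectargs '--debug_osd 20 --debug_ms 1'

Please let us know if other output would be more useful.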