Re: Slow Request on OSD

Multiple XFS corruptions, multiple leveldb issues. It looks to have been the result of drive write cache settings, which have been adjusted now.
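In case it helps anyone else, write cache can be checked and disabled per drive with something like the following (assuming plain SATA devices rather than disks behind a RAID controller with its own cache; the device name is just an example):

# hdparm -W /dev/sdb      # show current write-caching setting
# hdparm -W 0 /dev/sdb    # disable the volatile write cache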

You’ll see below that there are tons of PGs in bad states. The cluster was slowly but surely bringing the number of bad PGs down, but it seems to have hit a brick wall with this one slow request operation.

ceph -s
cluster []
     health HEALTH_ERR
            292 pgs are stuck inactive for more than 300 seconds
            142 pgs backfill_wait
            135 pgs degraded
            63 pgs down
            80 pgs incomplete
            199 pgs inconsistent
            2 pgs recovering
            5 pgs recovery_wait
            1 pgs repair
            132 pgs stale
            160 pgs stuck inactive
            132 pgs stuck stale
            71 pgs stuck unclean
            128 pgs undersized
            1 requests are blocked > 32 sec
            recovery 5301381/46255447 objects degraded (11.461%)
            recovery 6335505/46255447 objects misplaced (13.697%)
            recovery 131/20781800 unfound (0.001%)
            14943 scrub errors
            mds cluster is degraded
     monmap e1: 3 mons at {core=[]:6789/0,db=[]:6789/0,dev=[]:6789/0}
            election epoch 262, quorum 0,1,2 core,dev,db
      fsmap e3627: 1/1/1 up {0=core=up:replay}
     osdmap e3685: 8 osds: 8 up, 8 in; 153 remapped pgs
            flags sortbitwise
      pgmap v1807138: 744 pgs, 10 pools, 7668 GB data, 20294 kobjects
            8998 GB used, 50598 GB / 59596 GB avail
            5301381/46255447 objects degraded (11.461%)
            6335505/46255447 objects misplaced (13.697%)
            131/20781800 unfound (0.001%)
                 209 active+clean
                 170 active+clean+inconsistent
                 112 stale+active+clean
                  74 undersized+degraded+remapped+wait_backfill+peered
                  63 down+incomplete
                  48 active+undersized+degraded+remapped+wait_backfill
                  19 stale+active+clean+inconsistent
                  17 incomplete
                  12 active+remapped+wait_backfill
                   5 active+recovery_wait+degraded
                   4 undersized+degraded+remapped+inconsistent+wait_backfill+peered
                   4 active+remapped+inconsistent+wait_backfill
                   2 active+recovering+degraded
                   2 undersized+degraded+remapped+peered
                   1 stale+active+clean+scrubbing+deep+inconsistent+repair
                   1 active+clean+scrubbing+deep
                   1 active+clean+scrubbing+inconsistent
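
For anyone wanting to dig further, the blocked request and the stuck PGs can be narrowed down with something like the following (osd.5 is the OSD reporting the slow request here; <pgid> is a placeholder for a PG ID taken from the dump):

ceph health detail
ceph pg dump_stuck inactive unclean stale
ceph pg <pgid> query
ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_historic_ops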

Thanks,

Reed

On Aug 31, 2016, at 4:08 PM, Wido den Hollander <wido@xxxxxxxx> wrote:


On 31 August 2016 at 22:56, Reed Dier <reed.dier@xxxxxxxxxxx> wrote:


After a power failure left our jewel cluster crippled, I have hit a sticking point in attempted recovery.

Out of 8 OSDs, we likely lost 5-6; trying to salvage what we can.


That's probably too much. What do you mean by lost? Is XFS crippled/corrupted? That shouldn't happen.

In addition to RADOS pools, we were also using CephFS, and the cephfs.metadata and cephfs.data pools likely lost plenty of PGs.


What is the status of all PGs? What does 'ceph -s' show?

Are all PGs active? That's something which needs to happen first.

The MDS has reported this ever since returning from the power loss:
# ceph mds stat
e3627: 1/1/1 up {0=core=up:replay}


When looking at the slow request on the OSD, it shows this op, which I can’t quite figure out. Any help is appreciated.


Are all clients (including MDS) and OSDs running the same version?
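
Something along these lines should show it quickly (the MDS socket path assumes the default naming, with the MDS named 'core'):

ceph tell osd.* version
ceph --version                                                 # on each client node
ceph --admin-daemon /var/run/ceph/ceph-mds.core.asok version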

Wido

# ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok dump_ops_in_flight
{
   "ops": [
       {
           "description": "osd_op(mds.0.3625:8 6.c5265ab3 (undecoded) ack+retry+read+known_if_redirected+full_force e3668)",
           "initiated_at": "2016-08-31 10:37:18.833644",
           "age": 22212.235361,
           "duration": 22212.235379,
           "type_data": [
               "no flag points reached",
               [
                   {
                       "time": "2016-08-31 10:37:18.833644",
                       "event": "initiated"
                   }
               ]
           ]
       }
   ],
   "num_ops": 1
}

Thanks,

Reed
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

