Re: Slow Request on OSD

Nick Fisk <nick@xxxxxxxxxx> · Thu, 1 Sep 2016 08:50:16 +0100

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Wido den Hollander
> Sent: 01 September 2016 08:19
> To: Reed Dier <reed.dier@xxxxxxxxxxx>
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Slow Request on OSD
> 
> 
> > Op 31 augustus 2016 om 23:21 schreef Reed Dier <reed.dier@xxxxxxxxxxx>:
> >
> >
> > Multiple XFS corruptions, multiple leveldb issues. Looked to be result of write cache settings which have been adjusted now.

Reed, I realise that you are probably very busy attempting recovery at the moment, but when things calm down, I think it would be very beneficial to the list if you could expand on what settings caused this to happen. It might just stop this happening to someone else in the future.

> >
> 
> That is bad news, really bad.
> 
> > You’ll see below that there are tons of PG’s in bad states, and it was slowly but surely bringing the number of bad PGs down, but it
> seems to have hit a brick wall with this one slow request operation.
> >
> 
> No, you have more issues. You can 17 PGs which are incomplete, a few down+incomplete.
> 
> Without those PGs functioning (active+X) your MDS will probably not work.
> 
> Take a look at: http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
> 
> Make sure you go to HEALTH_WARN at first, in HEALTH_ERR the MDS will never come online.
> 
> Wido
> 
> > > ceph -s
> > > cluster []
> > >      health HEALTH_ERR
> > >             292 pgs are stuck inactive for more than 300 seconds
> > >             142 pgs backfill_wait
> > >             135 pgs degraded
> > >             63 pgs down
> > >             80 pgs incomplete
> > >             199 pgs inconsistent
> > >             2 pgs recovering
> > >             5 pgs recovery_wait
> > >             1 pgs repair
> > >             132 pgs stale
> > >             160 pgs stuck inactive
> > >             132 pgs stuck stale
> > >             71 pgs stuck unclean
> > >             128 pgs undersized
> > >             1 requests are blocked > 32 sec
> > >             recovery 5301381/46255447 objects degraded (11.461%)
> > >             recovery 6335505/46255447 objects misplaced (13.697%)
> > >             recovery 131/20781800 unfound (0.001%)
> > >             14943 scrub errors
> > >             mds cluster is degraded
> > >      monmap e1: 3 mons at {core=[]:6789/0,db=[]:6789/0,dev=[]:6789/0}
> > >             election epoch 262, quorum 0,1,2 core,dev,db
> > >       fsmap e3627: 1/1/1 up {0=core=up:replay}
> > >      osdmap e3685: 8 osds: 8 up, 8 in; 153 remapped pgs
> > >             flags sortbitwise
> > >       pgmap v1807138: 744 pgs, 10 pools, 7668 GB data, 20294 kobjects
> > >             8998 GB used, 50598 GB / 59596 GB avail
> > >             5301381/46255447 objects degraded (11.461%)
> > >             6335505/46255447 objects misplaced (13.697%)
> > >             131/20781800 unfound (0.001%)
> > >                  209 active+clean
> > >                  170 active+clean+inconsistent
> > >                  112 stale+active+clean
> > >                   74 undersized+degraded+remapped+wait_backfill+peered
> > >                   63 down+incomplete
> > >                   48 active+undersized+degraded+remapped+wait_backfill
> > >                   19 stale+active+clean+inconsistent
> > >                   17 incomplete
> > >                   12 active+remapped+wait_backfill
> > >                    5 active+recovery_wait+degraded
> > >                    4 undersized+degraded+remapped+inconsistent+wait_backfill+peered
> > >                    4 active+remapped+inconsistent+wait_backfill
> > >                    2 active+recovering+degraded
> > >                    2 undersized+degraded+remapped+peered
> > >                    1 stale+active+clean+scrubbing+deep+inconsistent+repair
> > >                    1 active+clean+scrubbing+deep
> > >                    1 active+clean+scrubbing+inconsistent
> >
> >
> > Thanks,
> >
> > Reed
> >
> > > On Aug 31, 2016, at 4:08 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
> > >
> > >>
> > >> Op 31 augustus 2016 om 22:56 schreef Reed Dier <reed.dier@xxxxxxxxxxx <mailto:reed.dier@xxxxxxxxxxx>>:
> > >>
> > >>
> > >> After a power failure left our jewel cluster crippled, I have hit a sticking point in attempted recovery.
> > >>
> > >> Out of 8 osd’s, we likely lost 5-6, trying to salvage what we can.
> > >>
> > >
> > > That's probably to much. How do you mean lost? Is XFS crippled/corrupted? That shouldn't happen.
> > >
> > >> In addition to rados pools, we were also using CephFS, and the cephfs.metadata and cephfs.data pools likely lost plenty of PG’s.
> > >>
> > >
> > > What is the status of all PGs? What does 'ceph -s' show?
> > >
> > > Are all PGs active? Since that's something which needs to be done first.
> > >
> > >> The mds has reported this ever since returning from the power loss:
> > >>> # ceph mds stat
> > >>> e3627: 1/1/1 up {0=core=up:replay}
> > >>
> > >>
> > >> When looking at the slow request on the osd, it shows this task which I can’t quite figure out. Any help appreciated.
> > >>
> > >
> > > Are all clients (including MDS) and OSDs running the same version?
> > >
> > > Wido
> > >
> > >>> # ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok
> > >>> dump_ops_in_flight {
> > >>>    "ops": [
> > >>>        {
> > >>>            "description": "osd_op(mds.0.3625:8 6.c5265ab3 (undecoded) ack+retry+read+known_if_redirected+full_force e3668)",
> > >>>            "initiated_at": "2016-08-31 10:37:18.833644",
> > >>>            "age": 22212.235361,
> > >>>            "duration": 22212.235379,
> > >>>            "type_data": [
> > >>>                "no flag points reached",
> > >>>                [
> > >>>                    {
> > >>>                        "time": "2016-08-31 10:37:18.833644",
> > >>>                        "event": "initiated"
> > >>>                    }
> > >>>                ]
> > >>>            ]
> > >>>        }
> > >>>    ],
> > >>>    "num_ops": 1
> > >>> }
> > >>
> > >> Thanks,
> > >>
> > >> Reed
> > >> _______________________________________________
> > >> ceph-users mailing list
> > >> ceph-users@xxxxxxxxxxxxxx <mailto:ceph-users@xxxxxxxxxxxxxx>
> > >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >> <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com