Re: MDS_SLOW_METADATA_IO

Stefan Kooman <stefan@xxxxxx> · Sun, 3 Mar 2019 15:20:26 +0100

Quoting Patrick Donnelly (pdonnell@xxxxxxxxxx):
> On Thu, Feb 28, 2019 at 12:49 PM Stefan Kooman <stefan@xxxxxx> wrote:
> >
> > Dear list,
> >
> > After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs
> > (MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for
> > more that 5 seconds. We have one active, and one active standby MDS. All
> > storage on SSD (Samsung PM863a / Intel DC4500). No other (OSD) slow ops
> > reported. The MDSes are underutilized, only a handful of active clients
> > and almost no load (fast hexacore CPU, 256 GB RAM, 20 Gb/s network). The
> > cluster is also far from busy.
> >
> > I've dumped ops in flight on the MDSes but all ops that are printed are
> > finished in a split second (duration: 0.000152), flag_point": "acquired
> > locks".
> 
> I believe you're looking at the wrong "ops" dump. You want to check
> "objector_requests".

I've done that, but not much timing info in there, i.e.:

    "ops": [
        {
            "tid": 416231,
            "pg": "6.791ce4c5",
            "osd": 31,
            "object_id": "200.00024d0f",
            "object_locator": "@6",
            "target_object_id": "200.00024d0f",
            "target_object_locator": "@6",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "last_sent": "79790s",
            "attempts": 1,
            "snapid": "head",
            "snap_context": "0=[]",
            "mtime": "2019-03-01 16:39:29.0.41847s",
            "osd_ops": [
                "write 798204~6249 [fadvise_dontneed]"
            ]
        }

The MDS_SLOW_METADATA_IO warning has been added in commit
https://github.com/ceph/ceph/commit/0f735f40315448560fde049ed3ea019a7d30d868#diff-74784f821b1aae68f768965680914268

Tracker issue: http://tracker.ceph.com/issues/24879
PR: 23022

It uses the "osd_op_complaint_time" as the "complaint_time". We have set
the osd_op_complaint_time to 5 seconds. 

The tracker issue hints that slow metadata would be caused by a slow OSD
(joural writes of MDS that take too long). However, we do not see any
slow mds ops (data), nor slow ops. We've changed the parameter
"mon_osd_warn_op_age" to 5 seconds as well so slow OSD ops would be
reported as well.

I do not think our OSDs are that slow, that often (~ at least every 15
minutes we would see that warning) ... so I really wonder what's going
on.

For now we've set the osd_op_complaint_time to "30" on the MDSes and do
not see any SLOW METADATA warnings anymore.

Anyone willing to set the osd_op_complaint_time to 5 seconds to see if
those messages will start popping up? ceph tell mds.* config set
osd_op_complaint_time 5 should do the trick ... you would need to be
running 12.2.11 and /or mimic 13.2.2.

Thanks,

Stefan

P.s. kudos to Wido den Hollander for helping us debug this.

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com