On Mon, Apr 13, 2020 at 4:14 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Mon, Apr 13, 2020 at 1:33 PM Josh Haft <paccrap@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > I upgraded from 13.2.5 to 14.2.6 last week and am now seeing
> > significantly higher latency on various MDS operations. For example,
> > the 2min rate of ceph_mds_server_req_create_latency_sum /
> > ceph_mds_server_req_create_latency_count for an 8hr window last Monday,
> > prior to the upgrade, averaged 2ms. Today, however, the same stat shows
> > 869ms. Other operations including open, readdir, rmdir, etc. are also
> > taking significantly longer.
> >
> > Here's a partial example of an op from dump_ops_in_flight:
> >
> >     {
> >         "description": "client_request(client.342513090:334359409 create #...)",
> >         "initiated_at": "2020-04-13 15:30:15.707637",
> >         "age": 0.19583208099999999,
> >         "duration": 0.19767626299999999,
> >         "type_data": {
> >             "flag_point": "submit entry: journal_and_reply",
> >             "reqid": "client.342513090:334359409",
> >             "op_type": "client_request",
> >             "client_info": {
> >                 "client": "client.342513090",
> >                 "tid": 334359409
> >             },
> >             "events": [
> >                 {
> >                     "time": "2020-04-13 15:30:15.707637",
> >                     "event": "initiated"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.707637",
> >                     "event": "header_read"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.707638",
> >                     "event": "throttled"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.707640",
> >                     "event": "all_read"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.781935",
> >                     "event": "dispatched"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.785086",
> >                     "event": "acquired locks"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.785507",
> >                     "event": "early_replied"
> >                 },
> >                 {
> >                     "time": "2020-04-13 15:30:15.785508",
> >                     "event": "submit entry: journal_and_reply"
> >                 }
> >             ]
> >         }
> >     }
> >
> > This, along with every other 'create' op I've seen, has a 50ms+ delay
> > between the all_read and dispatched events - what is happening during
> > this time?
>
> The incoming client request messages are read on a messenger thread,
> and then they get transferred into a queue for the main server event
> loop. "all_read" means the message has been read off the wire (in the
> messenger reader thread), and "dispatched" means the main server loop
> thread has started work on it.
> So a long wait in that period has nothing to do with the particular
> message which is waiting; it just indicates that the server is
> somewhat backed up.

Thanks for the clarification - the total op time does include the wait
between those two events, though - correct?

> In this case, once processing started it took about 3 ms to get locks
> (which IIRC probably means it had to queue and wait for something else
> to happen) and then sent back a reply and queued up the journal event
> (i.e., sent off the write to disk) in another half a millisecond.
>
> Anyway, I don't off-hand know of any reason things should have slowed
> down (really, I'd expect them to be a little faster as a bit more
> stuff got parallelized). Have the number of requests being processed
> by the MDS changed? Are your clients noticing the apparent new
> slowness?

The client workload should be roughly the same, though the number of
requests being processed has dropped from over 6000/s to under 500/s,
presumably because of the increase in time it takes to process each one.
Clients are definitely noticing this increased latency.

Thanks,
Josh

> > I'm not sure what I'm looking for in the MDS debug logs.
> >
> > We have a mix of clients from 12.2.x through 14.2.8; my plan was to
> > upgrade those pre-Nautilus clients this week. There is only a single
> > MDS rank with 1 backup. Other functions of this cluster - RBDs and RGW
> > - do not appear impacted, so this looks limited to the MDS. I did not
> > observe this behavior after upgrading a dev cluster last month.
> >
> > Has anyone seen anything similar? Thanks for any assistance!
> > Josh
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
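
[Editor's note: a minimal sketch of how the average create latency discussed at the top of the thread could be reproduced directly from the MDS admin socket rather than from Prometheus. It assumes `ceph daemon mds.<name> perf dump` exposes `mds_server.req_create_latency` with "avgcount" and "sum" fields; the exact counter layout can vary by release, so treat the field names as assumptions to verify on your own build.]

    #!/usr/bin/env python3
    # Sketch: sample the MDS perf counters twice and print the average create
    # latency over the interval, analogous to the Prometheus
    # rate(..._latency_sum) / rate(..._latency_count) query discussed above.
    # ASSUMPTION: mds_server.req_create_latency carries "avgcount" and "sum".
    import json
    import subprocess
    import sys
    import time

    def sample(mds_name):
        out = subprocess.check_output(
            ["ceph", "daemon", f"mds.{mds_name}", "perf", "dump"])
        lat = json.loads(out)["mds_server"]["req_create_latency"]
        return lat["avgcount"], lat["sum"]  # number of creates, total seconds

    def main():
        mds_name = sys.argv[1] if len(sys.argv) > 1 else "a"
        interval = 120  # seconds, matching the 2min rate window in the thread
        c1, s1 = sample(mds_name)
        time.sleep(interval)
        c2, s2 = sample(mds_name)
        dc, ds = c2 - c1, s2 - s1
        if dc == 0:
            print("no create requests in the sample window")
            return
        print(f"{dc / interval:.1f} creates/s, "
              f"avg latency {ds / dc * 1000:.1f} ms")

    if __name__ == "__main__":
        main()

Run it on the active MDS host (e.g. `python3 create_latency.py <mds-name>`); comparing its output before and after an upgrade gives the same 2ms-vs-869ms style comparison without depending on the exporter's scrape interval.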
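[Editor's note: a second small sketch, this one for spotting the all_read -> dispatched queueing gap Greg describes. It reads `dump_ops_in_flight` JSON on stdin and prints the time spent between consecutive events for each op. The field names ("ops", "type_data", "events", "time", "event") and the timestamp format are taken from the op dump quoted above; they are assumptions about your exact release, not a guaranteed schema.]

    #!/usr/bin/env python3
    # Sketch: pipe `ceph daemon mds.<name> dump_ops_in_flight` into this and
    # it prints the per-event gaps for each in-flight op, making a long
    # all_read -> dispatched wait easy to spot.
    # ASSUMPTION: top-level "ops" list and timestamps like
    # "2020-04-13 15:30:15.707637", as in the dump quoted above.
    import json
    import sys
    from datetime import datetime

    FMT = "%Y-%m-%d %H:%M:%S.%f"

    def main():
        dump = json.load(sys.stdin)
        for op in dump.get("ops", []):
            print(op.get("description", "<no description>"))
            prev = None
            for ev in op["type_data"]["events"]:
                t = datetime.strptime(ev["time"], FMT)
                if prev is not None:
                    gap_ms = (t - prev[1]).total_seconds() * 1000
                    print(f"  {prev[0]:>30} -> {ev['event']:<30} {gap_ms:9.3f} ms")
                prev = (ev["event"], t)

    if __name__ == "__main__":
        main()

Usage would be something like `ceph daemon mds.<name> dump_ops_in_flight | python3 op_gaps.py`; for the op quoted above it would show roughly 74 ms between all_read and dispatched, i.e. time the request sat in the dispatch queue before the main server loop picked it up.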