Re: Nautilus upgrade causes spike in MDS latency

Thanks for the assistance.

I restarted the active MDS Monday evening, which caused a standby to
pick up the rank; latencies immediately decreased and stayed in the
normal range throughout yesterday. Both the primary and standby MDS
daemons had been restarted during the upgrade last Friday evening, but
not since. I'll 'fail' back to the primary over the weekend to see
whether the problem returns.

Josh


On Wed, Apr 15, 2020 at 2:47 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>
> On Tue, Apr 14, 2020 at 6:17 AM Josh Haft <paccrap@xxxxxxxxx> wrote:
> >
> > On Mon, Apr 13, 2020 at 4:14 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Apr 13, 2020 at 1:33 PM Josh Haft <paccrap@xxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I upgraded from 13.2.5 to 14.2.6 last week and am now seeing
> > > > significantly higher latency on various MDS operations. For example,
> > > > the 2min rate of ceph_mds_server_req_create_latency_sum /
> > > > ceph_mds_server_req_create_latency_count averaged 2ms over an 8hr
> > > > window last Monday, prior to the upgrade. Today, however, the same
> > > > stat shows 869ms. Other operations, including open, readdir, and
> > > > rmdir, are also taking significantly longer.
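> > > >
> > > > (For clarity, that figure is just rate(sum)/rate(count) over the window;
> > > > a quick sketch of the arithmetic with made-up counter samples, not real
> > > > data from the cluster:)
> > > >
> > > >     # Average create latency over a window from two samples of the
> > > >     # ceph_mds_server_req_create_latency_sum / _count pair.
> > > >     def avg_create_latency(sum_t0, count_t0, sum_t1, count_t1):
> > > >         ops = count_t1 - count_t0
> > > >         return (sum_t1 - sum_t0) / ops if ops else 0.0   # seconds per op
> > > >
> > > >     # hypothetical 2min window: 240 creates, 0.48s summed latency -> 2ms
> > > >     print(avg_create_latency(100.00, 10000, 100.48, 10240))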
> > > >
> > > > Here's a partial example of an op from dump_ops_in_flight:
> > > >         {
> > > >             "description": "client_request(client.342513090:334359409
> > > > create #...)",
> > > >             "initiated_at": "2020-04-13 15:30:15.707637",
> > > >             "age": 0.19583208099999999,
> > > >             "duration": 0.19767626299999999,
> > > >             "type_data": {
> > > >                 "flag_point": "submit entry: journal_and_reply",
> > > >                 "reqid": "client.342513090:334359409",
> > > >                 "op_type": "client_request",
> > > >                 "client_info": {
> > > >                     "client": "client.342513090",
> > > >                     "tid": 334359409
> > > >                 },
> > > >                 "events": [
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.707637",
> > > >                         "event": "initiated"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.707637",
> > > >                         "event": "header_read"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.707638",
> > > >                         "event": "throttled"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.707640",
> > > >                         "event": "all_read"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.781935",
> > > >                         "event": "dispatched"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.785086",
> > > >                         "event": "acquired locks"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.785507",
> > > >                         "event": "early_replied"
> > > >                     },
> > > >                     {
> > > >                         "time": "2020-04-13 15:30:15.785508",
> > > >                         "event": "submit entry: journal_and_reply"
> > > >                     }
> > > >                 ]
> > > >             }
> > > >         }
> > > >
> > > >
> > > > This, along with every other 'create' op I've seen, has a 50ms+ delay
> > > > between the all_read and dispatched events. What is happening during
> > > > that time?
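> > > >
> > > > (The 50ms figure is just from diffing the event timestamps; a quick
> > > > sketch of how I'm eyeballing it, run against a saved dump like the one
> > > > above; the file name is only an example:)
> > > >
> > > >     # Print time spent between consecutive events for each op in a
> > > >     # saved 'ceph daemon mds.<name> dump_ops_in_flight' dump.
> > > >     import json
> > > >     from datetime import datetime
> > > >
> > > >     def ts(s):
> > > >         return datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f")
> > > >
> > > >     with open("ops_in_flight.json") as f:   # example file name
> > > >         ops = json.load(f)["ops"]
> > > >
> > > >     for op in ops:
> > > >         events = op["type_data"]["events"]
> > > >         for prev, cur in zip(events, events[1:]):
> > > >             ms = (ts(cur["time"]) - ts(prev["time"])).total_seconds() * 1000
> > > >             print(f'{prev["event"]} -> {cur["event"]}: {ms:.1f} ms')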
> > >
> > > The incoming client request messages are read on a messenger thread,
> > > and then they get transferred into a queue for the main server event
> > > loop. "all_read" means the message has been read off the wire (in the
> > > messenger reader thread), and "dispatched" means the main server loop
> > > thread has started work on it.
> > > So a long wait in that period has nothing to do with the particular
> > > message that is waiting; it just indicates that the server is
> > > somewhat backed up.
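> > >
> > > A toy way to see why (this is not the actual Ceph code, just the shape
> > > of it, with made-up numbers): the wait before "dispatched" is mostly
> > > time spent behind whatever is already sitting in the dispatch queue.
> > >
> > >     # Toy model: all_read -> dispatched wait grows with the backlog
> > >     # queued ahead of the request, not with the request itself.
> > >     per_request_ms = 0.8                  # hypothetical dispatch cost
> > >     for backlog in (0, 10, 60):           # hypothetical queue depths
> > >         print(f"backlog {backlog:3d} -> ~{backlog * per_request_ms:.0f} ms "
> > >               "before dispatch")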
> > >
> > Thanks for the clarification - the total op time does include the wait
> > between those two events, though - correct?
> >
> > > In this case, once processing started, it took about 3 ms to get locks
> > > (which IIRC probably means it had to queue and wait for something else
> > > to happen), and then it sent back a reply and queued up the journal
> > > event (i.e., sent off the write to disk) in another half a millisecond.
> > >
> > >
> > > Anyway, I don't offhand know of any reason things should have slowed
> > > down (really, I'd expect them to be a little faster, as a bit more
> > > stuff got parallelized). Has the number of requests being processed
> > > by the MDS changed? Are your clients noticing the apparent new
> > > slowness?
> > >
> > The client workload should be roughly the same, though the number
> > being processed has dropped from over 6000/s to under 500/s,
> > presumably because of the increase in time to process each. Clients
> > are definitely noticing this increased latency.
> >
>
> Are your Ceph binaries a debug build?
>
> > Thanks,
> > Josh
> >
> >
> > > > I'm not sure what I'm looking for in the MDS debug logs.
> > > >
> > > > We have a mix of clients from 12.2.x through 14.2.8; my plan was to
> > > > upgrade the pre-Nautilus clients this week. There is only a single
> > > > MDS rank, with one standby. Other functions of this cluster (RBD and
> > > > RGW) do not appear to be impacted, so this looks limited to the MDS.
> > > > I did not observe this behavior after upgrading a dev cluster last
> > > > month.
> > > >
> > > > Has anyone seen anything similar? Thanks for any assistance!
> > > > Josh
> > >
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


