This issue did subside after restarting the original primary daemon and
failing back to it. I've since enabled multi-MDS and latencies overall
have decreased even further. Thanks for your assistance.

On Wed, Apr 15, 2020 at 8:32 AM Josh Haft <paccrap@xxxxxxxxx> wrote:
>
> Thanks for the assistance.
>
> I restarted the active MDS Monday evening, which caused a standby to
> pick up the rank; latencies immediately decreased and returned to
> normal range throughout yesterday. Both the primary and standby MDSs
> had been restarted when upgraded last Friday evening, but not since.
> I'll 'fail' back to the primary over the weekend to see whether the
> problem returns.
>
> Josh
>
>
> On Wed, Apr 15, 2020 at 2:47 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> >
> > On Tue, Apr 14, 2020 at 6:17 AM Josh Haft <paccrap@xxxxxxxxx> wrote:
> > >
> > > On Mon, Apr 13, 2020 at 4:14 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> > > >
> > > > On Mon, Apr 13, 2020 at 1:33 PM Josh Haft <paccrap@xxxxxxxxx> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I upgraded from 13.2.5 to 14.2.6 last week and am now seeing
> > > > > significantly higher latency on various MDS operations. For example,
> > > > > the 2min rate of ceph_mds_server_req_create_latency_sum /
> > > > > ceph_mds_server_req_create_latency_count for an 8hr window last Monday
> > > > > prior to the upgrade was an average of 2ms. Today, however, the same
> > > > > stat shows 869ms. Other operations including open, readdir, rmdir,
> > > > > etc. are also taking significantly longer.
> > > > >
> > > > > Here's a partial example of an op from dump_ops_in_flight:
> > > > > {
> > > > >     "description": "client_request(client.342513090:334359409 create #...)",
> > > > >     "initiated_at": "2020-04-13 15:30:15.707637",
> > > > >     "age": 0.19583208099999999,
> > > > >     "duration": 0.19767626299999999,
> > > > >     "type_data": {
> > > > >         "flag_point": "submit entry: journal_and_reply",
> > > > >         "reqid": "client.342513090:334359409",
> > > > >         "op_type": "client_request",
> > > > >         "client_info": {
> > > > >             "client": "client.342513090",
> > > > >             "tid": 334359409
> > > > >         },
> > > > >         "events": [
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.707637",
> > > > >                 "event": "initiated"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.707637",
> > > > >                 "event": "header_read"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.707638",
> > > > >                 "event": "throttled"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.707640",
> > > > >                 "event": "all_read"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.781935",
> > > > >                 "event": "dispatched"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.785086",
> > > > >                 "event": "acquired locks"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.785507",
> > > > >                 "event": "early_replied"
> > > > >             },
> > > > >             {
> > > > >                 "time": "2020-04-13 15:30:15.785508",
> > > > >                 "event": "submit entry: journal_and_reply"
> > > > >             }
> > > > >         ]
> > > > >     }
> > > > > }
> > > > >
> > > > > This, along with every other 'create' op I've seen, has a 50ms+ delay
> > > > > between the all_read and dispatched events - what is happening during
> > > > > this time?
> > > >
> > > > The incoming client request messages are read on a messenger thread,
> > > > and then they get transferred into a queue for the main server event
> > > > loop. "all_read" means the message has been read off the wire (in the
> > > > messenger reader thread), and "dispatched" means the main server loop
> > > > thread has started work on it.
> > > > So a long wait in that period has nothing to do with the particular
> > > > message which is waiting; it just indicates that the server is
> > > > somewhat backed up.
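
In case it helps anyone else reading the archive: that gap can be
summarized across everything currently in flight rather than eyeballing
one op at a time. Below is a rough Python sketch of the idea; it assumes
the dump_ops_in_flight JSON matches the sample op above (a top-level
"ops" list, with the events under "type_data" and the same timestamp
format), and "mds.a" is only a placeholder for the real daemon name.
Run it on the host that has the MDS admin socket, since the
'ceph daemon' form talks to the local socket.

#!/usr/bin/env python3
# Rough sketch, not battle-tested: summarize the all_read -> dispatched
# gap for every op the MDS currently has in flight. Assumes the JSON
# layout matches the sample op quoted above (a top-level "ops" list,
# each op with type_data.events entries of {"time", "event"}); adjust
# the key names if your release differs. "mds.a" is a placeholder.
import json
import subprocess
from datetime import datetime

MDS_NAME = "mds.a"  # placeholder daemon name


def parse_ts(ts):
    # Timestamps in the dump look like "2020-04-13 15:30:15.707637"
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f")


raw = subprocess.check_output(
    ["ceph", "daemon", MDS_NAME, "dump_ops_in_flight"])
ops = json.loads(raw).get("ops", [])

gaps = []
for op in ops:
    events = {e["event"]: parse_ts(e["time"])
              for e in op.get("type_data", {}).get("events", [])}
    if "all_read" in events and "dispatched" in events:
        gaps.append((events["dispatched"] - events["all_read"]).total_seconds())

if gaps:
    print("ops with both events: %d" % len(gaps))
    print("all_read -> dispatched: avg %.1f ms, max %.1f ms"
          % (1000 * sum(gaps) / len(gaps), 1000 * max(gaps)))
else:
    print("no in-flight ops had both an all_read and a dispatched event")
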
"all_read" means the message has been read off the wire (in the > > > > messenger reader thread), and then "dispatched' means it the main > > > > server loop thread has started work. > > > > So a long wait in that period has nothing to do with the particular > > > > message which is waiting, it just indicates that the server is > > > > somewhat backed up. > > > > > > > Thanks for the clarification - the total op time does include the wait > > > between those two events, though - correct? > > > > > > > In this case, once processing started it took about 3 ms to get locks > > > > (which IIRC probably means it had to queue and wait for something else > > > > to happen) and then sent back a reply and queued up the journal event > > > > (ie, sent off the write to disk) in another half a millisecond. > > > > > > > > > > > > Anyway, I don't off-hand know of any reason things should have slowed > > > > down (really, I'd expect them to be a little faster as a bit more > > > > stuff got parallelized). Have the number of requests being processed > > > > by the MDS changed? Are your clients noticing the apparent new > > > > slowness? > > > > > > > The client workload should be roughly the same, though the number > > > being processed has dropped from over 6000/s to under 500/s, > > > presumably because of the increase in time to process each. Clients > > > are definitely noticing this increased latency. > > > > > > > are your ceph binaries debug version? > > > > > Thanks, > > > Josh > > > > > > > > > > > I'm not sure what I'm looking for the in the MDS debug logs. > > > > > > > > > > We have a mix of clients from 12.2.x through 14.2.8; my plan was to > > > > > upgrade those pre-Nautilus clients this week. There is only a single > > > > > MDS rank with 1 backup. Other functions of this cluster - RBDs and RGW > > > > > - do not appear impacted so this looks limited to the MDS. I did not > > > > > observe this behavior after upgrading a dev cluster last month. > > > > > > > > > > Has anyone seen anything similar? Thanks for any assistance! > > > > > Josh > > > > > _______________________________________________ > > > > > ceph-users mailing list -- ceph-users@xxxxxxx > > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx > > > > > > > > > > > > _______________________________________________ > > > ceph-users mailing list -- ceph-users@xxxxxxx > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx _______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx