Hello,

I am trying to debug slow operations in our cluster running Nautilus 14.2.13. I am analysing the output of the "ceph daemon osd.N dump_historic_ops" command, and I am noticing that most of the time is spent between the "header_read" and "throttled" events. For example, below is an operation that took ~160 seconds to complete, and almost all of that time was spent between these two events.

Going by the descriptions at https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-osd/#debugging-slow-requests:

- header_read: When the messenger first started reading the message off the wire.
- throttled: When the messenger tried to acquire memory throttle space to read the message into memory.
- all_read: When the messenger finished reading the message off the wire.

Does this mean that the slowness I am observing is because the OSD's messaging layer is not able to acquire the memory required for the message fast enough? The system has lots of available memory (over 300G), so how do I tune the OSD to perform better at this?

Appreciate any feedback on this.
{
    "description": "osd_op(client.405792.0:98299 3.313 3:c8c63189:::rbd_data.51b046b8b4567.0000000000000180:head [set-alloc-hint object_size 4194304 write_size 4194304,writefull 0~4194304] snapc 0=[] ondisk+write+known_if_redirected e1073)",
    "initiated_at": "2020-11-06 16:16:40.924448",
    "age": 164.32155802899999,
    "duration": 159.57800813,
    "type_data": {
        "flag_point": "commit sent; apply or cleanup",
        "client_info": {
            "client": "client.405792",
            "client_addr": "v1:x.y.156.101:0/3840080733",
            "tid": 98299
        },
        "events": [
            { "time": "2020-11-06 16:16:40.924448", "event": "initiated" },
            { "time": "2020-11-06 16:16:40.924448", "event": "header_read" },
            { "time": "2020-11-06 16:19:20.481593", "event": "throttled" },
            { "time": "2020-11-06 16:19:20.487331", "event": "all_read" },
            { "time": "2020-11-06 16:19:20.487333", "event": "dispatched" },
            { "time": "2020-11-06 16:19:20.487340", "event": "queued_for_pg" },
            { "time": "2020-11-06 16:19:20.487372", "event": "reached_pg" },
            { "time": "2020-11-06 16:19:20.487507", "event": "started" },
            { "time": "2020-11-06 16:19:20.487586", "event": "waiting for subops from 1,94" },
            { "time": "2020-11-06 16:19:20.491873", "event": "op_commit" },
            { "time": "2020-11-06 16:19:20.501164", "event": "sub_op_commit_rec" },
            { "time": "2020-11-06 16:19:20.502423", "event": "sub_op_commit_rec" },
            { "time": "2020-11-06 16:19:20.502438", "event": "commit_sent" },
            { "time": "2020-11-06 16:19:20.502456", "event": "done" }
        ]
    }
}
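
For anyone wanting to check their own ops the same way, here is a minimal sketch of how the gap between consecutive events can be computed from one op entry. It assumes the timestamp format shown in the dump above ("YYYY-MM-DD HH:MM:SS.ffffff"); other Ceph releases may print timestamps differently. The names event_gaps, slowest_gap, and SAMPLE_OP are hypothetical helpers for illustration, not part of any Ceph tooling, and SAMPLE_OP holds only the first few events from the op above.

```python
from datetime import datetime

# Assumption: timestamps look like "2020-11-06 16:16:40.924448", as in the dump above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# The first few events of the op shown above, trimmed for brevity.
SAMPLE_OP = {
    "type_data": {
        "events": [
            {"time": "2020-11-06 16:16:40.924448", "event": "initiated"},
            {"time": "2020-11-06 16:16:40.924448", "event": "header_read"},
            {"time": "2020-11-06 16:19:20.481593", "event": "throttled"},
            {"time": "2020-11-06 16:19:20.487331", "event": "all_read"},
            {"time": "2020-11-06 16:19:20.487333", "event": "dispatched"},
        ]
    }
}


def event_gaps(op):
    """Return (previous_event, event, seconds) for each consecutive event pair."""
    events = op["type_data"]["events"]
    gaps = []
    for prev, cur in zip(events, events[1:]):
        t0 = datetime.strptime(prev["time"], TIME_FORMAT)
        t1 = datetime.strptime(cur["time"], TIME_FORMAT)
        gaps.append((prev["event"], cur["event"], (t1 - t0).total_seconds()))
    return gaps


def slowest_gap(op):
    """Return the consecutive event pair with the largest elapsed time."""
    return max(event_gaps(op), key=lambda gap: gap[2])


if __name__ == "__main__":
    for prev_event, event, seconds in event_gaps(SAMPLE_OP):
        print(f"{prev_event} -> {event}: {seconds:.6f}s")
    print("slowest:", slowest_gap(SAMPLE_OP))
```

For this op, the largest gap is the one between header_read and throttled, roughly 159.56 seconds of the 159.58 second total. To run it over a full dump, each op entry parsed from the command's JSON output would be passed to event_gaps; I believe the ops sit under a top-level "ops" key, but that is worth checking against the actual output.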