I hope the daemon mds.icadmin011 is running on the same machine where you are looking for /tmp/dump.txt, since the file is created on the system that runs that daemon.

On Wed, May 24, 2023 at 2:16 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:

> Hi Milind,
>
> you are absolutely right.
>
> The dump_ops_in_flight output gives a good hint about what's happening:
> {
>     "ops": [
>         {
>             "description": "internal op exportdir:mds.5:975673",
>             "initiated_at": "2023-05-23T17:49:53.030611+0200",
>             "age": 60596.355186077999,
>             "duration": 60596.355234167997,
>             "type_data": {
>                 "flag_point": "failed to wrlock, waiting",
>                 "reqid": "mds.5:975673",
>                 "op_type": "internal_op",
>                 "internal_op": 5377,
>                 "op_name": "exportdir",
>                 "events": [
>                     {
>                         "time": "2023-05-23T17:49:53.030611+0200",
>                         "event": "initiated"
>                     },
>                     {
>                         "time": "2023-05-23T17:49:53.030611+0200",
>                         "event": "throttled"
>                     },
>                     {
>                         "time": "2023-05-23T17:49:53.030611+0200",
>                         "event": "header_read"
>                     },
>                     {
>                         "time": "2023-05-23T17:49:53.030611+0200",
>                         "event": "all_read"
>                     },
>                     {
>                         "time": "2023-05-23T17:49:53.030611+0200",
>                         "event": "dispatched"
>                     },
>                     {
>                         "time": "2023-05-23T17:49:53.030657+0200",
>                         "event": "requesting remote authpins"
>                     },
>                     {
>                         "time": "2023-05-23T17:49:53.050253+0200",
>                         "event": "failed to wrlock, waiting"
>                     }
>                 ]
>             }
>         }
>     ],
>     "num_ops": 1
> }
>
> However, dump cache does not seem to produce any output:
> root@icadmin011:~# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
> root@icadmin011:~# ls /tmp
> ssh-cHvP3iF611
> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-ceph-mds@icadmin011.service-SGZrKf
> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-logind.service-uU1GAi
> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-resolved.service-KYHd7f
> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-timesyncd.service-1Qtj5i
>
> Do you have any hint?
>
> Best,
>
> Emmanuel
>
> On Wed, May 24, 2023 at 10:30 AM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>
>> Emmanuel,
>> You probably missed the "daemon" keyword after the "ceph" command name.
>> Here are the docs for Pacific:
>> https://docs.ceph.com/en/pacific/cephfs/troubleshooting/
>>
>> So, your command should've been:
>> # ceph daemon mds.icadmin011 dump cache /tmp/dump.txt
>>
>> You could also dump the ops in flight with:
>> # ceph daemon mds.icadmin011 dump_ops_in_flight
>>
>> On Wed, May 24, 2023 at 1:38 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:
>>
>> > Hi,
>> >
>> > we are running a CephFS cluster with the following version:
>> > ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
>> >
>> > Several MDSs are reporting slow requests:
>> > HEALTH_WARN 4 MDSs report slow requests
>> > [WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests
>> >     mds.icadmin011(mds.5): 1 slow requests are blocked > 30 secs
>> >     mds.icadmin015(mds.6): 2 slow requests are blocked > 30 secs
>> >     mds.icadmin006(mds.4): 8 slow requests are blocked > 30 secs
>> >     mds.icadmin007(mds.2): 2 slow requests are blocked > 30 secs
>> >
>> > According to the Quincy documentation (https://docs.ceph.com/en/quincy/cephfs/troubleshooting/), this can be investigated by issuing:
>> > ceph mds.icadmin011 dump cache /tmp/dump.txt
>> >
>> > Unfortunately, this command fails:
>> > no valid command found; 10 closest matches:
>> > pg stat
>> > pg getmap
>> > pg dump [all|summary|sum|delta|pools|osds|pgs|pgs_brief...]
>> > pg dump_json [all|summary|sum|pools|osds|pgs...]
>> > pg dump_pools_json
>> > pg ls-by-pool <poolstr> [<states>...]
>> > pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
>> > pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
>> > pg ls [<pool:int>] [<states>...]
>> > pg dump_stuck [inactive|unclean|stale|undersized|degraded...] [<threshold:int>]
>> > Error EINVAL: invalid command
>> >
>> > I imagine this is because we are running Pacific rather than Quincy.
>> >
>> > Looking at the Pacific documentation (https://docs.ceph.com/en/pacific/cephfs/health-messages/), I should:
>> > > Use the ops admin socket command to list outstanding metadata operations.
>> >
>> > Unfortunately, I don't really understand what I'm supposed to do. Can someone give me a pointer?
>> >
>> > Best,
>> >
>> > Emmanuel
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users@xxxxxxx
>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>> --
>> Milind
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Milind
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
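
[Editor's note] A minimal sketch of the check suggested at the top of this thread, assuming only stock Ceph CLI commands; the --cluster floki flag and the daemon name icadmin011 are taken from the messages above, and the target host is whatever the first command reports:

First, confirm which host is actually running the daemon mds.icadmin011 (the daemon metadata includes its hostname):
# ceph --cluster floki mds metadata icadmin011 | grep hostname

Then, on that host, query the admin socket locally. The dump file is written to /tmp on the machine running the daemon, not on the machine where the command is typed:
# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
# ls -l /tmp/dump.txt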