Absolutely! :-)

root@icadmin011:/tmp# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
root@icadmin011:/tmp# ll
total 48
drwxrwxrwt 12 root root 4096 May 24 13:23 ./
drwxr-xr-x 18 root root 4096 Jun  9  2022 ../
drwxrwxrwt  2 root root 4096 May  4 12:43 .ICE-unix/
drwxrwxrwt  2 root root 4096 May  4 12:43 .Test-unix/
drwxrwxrwt  2 root root 4096 May  4 12:43 .X11-unix/
drwxrwxrwt  2 root root 4096 May  4 12:43 .XIM-unix/
drwxrwxrwt  2 root root 4096 May  4 12:43 .font-unix/
drwx------  2 root root 4096 May 24 13:23 ssh-Sl5AiotnXp/
drwx------  3 root root 4096 May  8 13:26 'systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-ceph-mds@icadmin011.service-SGZrKf '/
drwx------  3 root root 4096 May  4 12:43 systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-logind.service-uU1GAi/
drwx------  3 root root 4096 May  4 12:43 systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-resolved.service-KYHd7f/
drwx------  3 root root 4096 May  4 12:43 systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-timesyncd.service-1Qtj5i/

On Wed, May 24, 2023 at 1:17 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:

> I hope the daemon mds.icadmin011 is running on the same machine on which
> you are looking for /tmp/dump.txt, since the file is created on the system
> which has that daemon running.
>
> On Wed, May 24, 2023 at 2:16 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:
>
>> Hi Milind,
>>
>> you are absolutely right.
>>
>> The dump_ops_in_flight output is giving a good hint about what's happening:
>> {
>>     "ops": [
>>         {
>>             "description": "internal op exportdir:mds.5:975673",
>>             "initiated_at": "2023-05-23T17:49:53.030611+0200",
>>             "age": 60596.355186077999,
>>             "duration": 60596.355234167997,
>>             "type_data": {
>>                 "flag_point": "failed to wrlock, waiting",
>>                 "reqid": "mds.5:975673",
>>                 "op_type": "internal_op",
>>                 "internal_op": 5377,
>>                 "op_name": "exportdir",
>>                 "events": [
>>                     {
>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>                         "event": "initiated"
>>                     },
>>                     {
>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>                         "event": "throttled"
>>                     },
>>                     {
>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>                         "event": "header_read"
>>                     },
>>                     {
>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>                         "event": "all_read"
>>                     },
>>                     {
>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>                         "event": "dispatched"
>>                     },
>>                     {
>>                         "time": "2023-05-23T17:49:53.030657+0200",
>>                         "event": "requesting remote authpins"
>>                     },
>>                     {
>>                         "time": "2023-05-23T17:49:53.050253+0200",
>>                         "event": "failed to wrlock, waiting"
>>                     }
>>                 ]
>>             }
>>         }
>>     ],
>>     "num_ops": 1
>> }
>>
>> However, the dump cache command does not seem to produce any output:
>> root@icadmin011:~# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
>> root@icadmin011:~# ls /tmp
>> ssh-cHvP3iF611
>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-ceph-mds@icadmin011.service-SGZrKf
>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-logind.service-uU1GAi
>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-resolved.service-KYHd7f
>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-timesyncd.service-1Qtj5i
>>
>> Do you have any hint?
>>
>> Best,
>>
>> Emmanuel
>>
>> On Wed, May 24, 2023 at 10:30 AM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>>
>>> Emmanuel,
>>> You probably missed the "daemon" keyword after the "ceph" command name.
>>> Here are the docs for Pacific:
>>> https://docs.ceph.com/en/pacific/cephfs/troubleshooting/
>>>
>>> So, your command should've been:
>>> # ceph daemon mds.icadmin011 dump cache /tmp/dump.txt
>>>
>>> You could also dump the ops in flight with:
>>> # ceph daemon mds.icadmin011 dump_ops_in_flight
>>>
>>> On Wed, May 24, 2023 at 1:38 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:
>>>
>>> > Hi,
>>> >
>>> > we are running a cephfs cluster with the following version:
>>> > ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
>>> >
>>> > Several MDSs are reporting slow requests:
>>> > HEALTH_WARN 4 MDSs report slow requests
>>> > [WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests
>>> >     mds.icadmin011(mds.5): 1 slow requests are blocked > 30 secs
>>> >     mds.icadmin015(mds.6): 2 slow requests are blocked > 30 secs
>>> >     mds.icadmin006(mds.4): 8 slow requests are blocked > 30 secs
>>> >     mds.icadmin007(mds.2): 2 slow requests are blocked > 30 secs
>>> >
>>> > According to Quincy's documentation
>>> > (https://docs.ceph.com/en/quincy/cephfs/troubleshooting/), this can be
>>> > investigated by issuing:
>>> > ceph mds.icadmin011 dump cache /tmp/dump.txt
>>> >
>>> > Unfortunately, this command fails:
>>> > no valid command found; 10 closest matches:
>>> > pg stat
>>> > pg getmap
>>> > pg dump [all|summary|sum|delta|pools|osds|pgs|pgs_brief...]
>>> > pg dump_json [all|summary|sum|pools|osds|pgs...]
>>> > pg dump_pools_json
>>> > pg ls-by-pool <poolstr> [<states>...]
>>> > pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
>>> > pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
>>> > pg ls [<pool:int>] [<states>...]
>>> > pg dump_stuck [inactive|unclean|stale|undersized|degraded...] [<threshold:int>]
>>> > Error EINVAL: invalid command
>>> >
>>> > I imagine that it is related to the fact that we are running the Pacific
>>> > version and not the Quincy version.
>>> >
>>> > When looking at Pacific's documentation
>>> > (https://docs.ceph.com/en/pacific/cephfs/health-messages/), I should:
>>> > > Use the ops admin socket command to list outstanding metadata operations.
>>> >
>>> > Unfortunately, I fail to really understand what I'm supposed to do. Can
>>> > someone give a pointer?
>>> >
>>> > Best,
>>> >
>>> > Emmanuel
>>> > _______________________________________________
>>> > ceph-users mailing list -- ceph-users@xxxxxxx
>>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>> --
>>> Milind
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> --
> Milind
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
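
For readers following this thread with a non-default cluster name: the MDS admin socket can also be addressed directly by path instead of by daemon name. A minimal sketch, assuming the default socket location under /var/run/ceph/ and the "floki" cluster name used above (the exact .asok filename is an assumption; adjust it to whatever exists on your MDS host):

# find the local MDS admin socket
ls /var/run/ceph/floki-mds.*.asok
# same commands as above, but pointed straight at the socket
ceph --admin-daemon /var/run/ceph/floki-mds.icadmin011.asok dump_ops_in_flight
ceph --admin-daemon /var/run/ceph/floki-mds.icadmin011.asok dump cache /tmp/dump.txt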