Try the command with the --id argument:
# ceph --id admin --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt

I presume that your keyring has an appropriate entry for the client.admin user.

On Wed, May 24, 2023 at 5:10 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:

> Absolutely! :-)
>
> root@icadmin011:/tmp# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
> root@icadmin011:/tmp# ll
> total 48
> drwxrwxrwt 12 root root 4096 May 24 13:23 ./
> drwxr-xr-x 18 root root 4096 Jun  9  2022 ../
> drwxrwxrwt  2 root root 4096 May  4 12:43 .ICE-unix/
> drwxrwxrwt  2 root root 4096 May  4 12:43 .Test-unix/
> drwxrwxrwt  2 root root 4096 May  4 12:43 .X11-unix/
> drwxrwxrwt  2 root root 4096 May  4 12:43 .XIM-unix/
> drwxrwxrwt  2 root root 4096 May  4 12:43 .font-unix/
> drwx------  2 root root 4096 May 24 13:23 ssh-Sl5AiotnXp/
> drwx------  3 root root 4096 May  8 13:26 'systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-ceph-mds@icadmin011.service-SGZrKf'/
> drwx------  3 root root 4096 May  4 12:43 systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-logind.service-uU1GAi/
> drwx------  3 root root 4096 May  4 12:43 systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-resolved.service-KYHd7f/
> drwx------  3 root root 4096 May  4 12:43 systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-timesyncd.service-1Qtj5i/
>
> On Wed, May 24, 2023 at 1:17 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>
>> I hope the daemon mds.icadmin011 is running on the same machine where you
>> are looking for /tmp/dump.txt, since the file is created on the system
>> that has that daemon running.
>>
>> On Wed, May 24, 2023 at 2:16 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:
>>
>>> Hi Milind,
>>>
>>> you are absolutely right.
>>>
>>> The dump_ops_in_flight output gives a good hint about what's happening:
>>> {
>>>     "ops": [
>>>         {
>>>             "description": "internal op exportdir:mds.5:975673",
>>>             "initiated_at": "2023-05-23T17:49:53.030611+0200",
>>>             "age": 60596.355186077999,
>>>             "duration": 60596.355234167997,
>>>             "type_data": {
>>>                 "flag_point": "failed to wrlock, waiting",
>>>                 "reqid": "mds.5:975673",
>>>                 "op_type": "internal_op",
>>>                 "internal_op": 5377,
>>>                 "op_name": "exportdir",
>>>                 "events": [
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>>                         "event": "initiated"
>>>                     },
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>>                         "event": "throttled"
>>>                     },
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>>                         "event": "header_read"
>>>                     },
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>>                         "event": "all_read"
>>>                     },
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.030611+0200",
>>>                         "event": "dispatched"
>>>                     },
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.030657+0200",
>>>                         "event": "requesting remote authpins"
>>>                     },
>>>                     {
>>>                         "time": "2023-05-23T17:49:53.050253+0200",
>>>                         "event": "failed to wrlock, waiting"
>>>                     }
>>>                 ]
>>>             }
>>>         }
>>>     ],
>>>     "num_ops": 1
>>> }
>>>
>>> However, the dump cache does not seem to produce any output:
>>> root@icadmin011:~# ceph --cluster floki daemon mds.icadmin011 dump cache /tmp/dump.txt
>>> root@icadmin011:~# ls /tmp
>>> ssh-cHvP3iF611
>>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-ceph-mds@icadmin011.service-SGZrKf
>>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-logind.service-uU1GAi
>>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-resolved.service-KYHd7f
>>> systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-timesyncd.service-1Qtj5i
>>>
>>> Do you have any hint?
>>>
>>> Best,
>>>
>>> Emmanuel
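If the dump file still does not appear after adding --id, two more things are worth ruling out: whether the command is actually talking to the right admin socket, and whether the MDS wrote the dump into a private /tmp (the systemd-private-...-ceph-mds@icadmin011.service-* directory in the listing above suggests the unit may run with systemd PrivateTmp). A minimal sketch, assuming the default /var/run/ceph/$cluster-$name.asok socket naming for a cluster named floki; neither path is confirmed in the thread:

# see which admin sockets actually exist on this host
ls -l /var/run/ceph/

# point "ceph daemon" at the socket file instead of the daemon name
# (path assumes the default $cluster-$name.asok naming for cluster "floki")
ceph --cluster floki daemon /var/run/ceph/floki-mds.icadmin011.asok dump cache /tmp/dump.txt

# if the ceph-mds unit runs with PrivateTmp=true, the dump lands in the
# service's private tmp directory rather than the host /tmp
ls /tmp/systemd-private-*-ceph-mds@icadmin011.service-*/tmp/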
>>>
>>> On Wed, May 24, 2023 at 10:30 AM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>>>
>>>> Emmanuel,
>>>> You probably missed the "daemon" keyword after the "ceph" command name.
>>>> Here are the docs for Pacific:
>>>> https://docs.ceph.com/en/pacific/cephfs/troubleshooting/
>>>>
>>>> So, your command should've been:
>>>> # ceph daemon mds.icadmin011 dump cache /tmp/dump.txt
>>>>
>>>> You could also dump the ops in flight with:
>>>> # ceph daemon mds.icadmin011 dump_ops_in_flight
>>>>
>>>> On Wed, May 24, 2023 at 1:38 PM Emmanuel Jaep <emmanuel.jaep@xxxxxxxxx> wrote:
>>>>
>>>> > Hi,
>>>> >
>>>> > we are running a CephFS cluster with the following version:
>>>> > ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
>>>> >
>>>> > Several MDSs are reporting slow requests:
>>>> > HEALTH_WARN 4 MDSs report slow requests
>>>> > [WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests
>>>> >     mds.icadmin011(mds.5): 1 slow requests are blocked > 30 secs
>>>> >     mds.icadmin015(mds.6): 2 slow requests are blocked > 30 secs
>>>> >     mds.icadmin006(mds.4): 8 slow requests are blocked > 30 secs
>>>> >     mds.icadmin007(mds.2): 2 slow requests are blocked > 30 secs
>>>> >
>>>> > According to the Quincy documentation
>>>> > (https://docs.ceph.com/en/quincy/cephfs/troubleshooting/), this can be
>>>> > investigated by issuing:
>>>> > ceph mds.icadmin011 dump cache /tmp/dump.txt
>>>> >
>>>> > Unfortunately, this command fails:
>>>> > no valid command found; 10 closest matches:
>>>> > pg stat
>>>> > pg getmap
>>>> > pg dump [all|summary|sum|delta|pools|osds|pgs|pgs_brief...]
>>>> > pg dump_json [all|summary|sum|pools|osds|pgs...]
>>>> > pg dump_pools_json
>>>> > pg ls-by-pool <poolstr> [<states>...]
>>>> > pg ls-by-primary <id|osd.id> [<pool:int>] [<states>...]
>>>> > pg ls-by-osd <id|osd.id> [<pool:int>] [<states>...]
>>>> > pg ls [<pool:int>] [<states>...]
>>>> > pg dump_stuck [inactive|unclean|stale|undersized|degraded...] [<threshold:int>]
>>>> > Error EINVAL: invalid command
>>>> >
>>>> > I imagine that this is related to the fact that we are running the Pacific
>>>> > version and not the Quincy version.
>>>> >
>>>> > When looking at the Pacific documentation
>>>> > (https://docs.ceph.com/en/pacific/cephfs/health-messages/), I should:
>>>> > > Use the ops admin socket command to list outstanding metadata operations.
>>>> >
>>>> > Unfortunately, I fail to really understand what I'm supposed to do. Can
>>>> > someone give me a pointer?
>>>> >
>>>> > Best,
>>>> >
>>>> > Emmanuel
>>>> > _______________________________________________
>>>> > ceph-users mailing list -- ceph-users@xxxxxxx
>>>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>>
>>>> --
>>>> Milind
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users@xxxxxxx
>>>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>>
>>
>> --
>> Milind
>>

--
Milind
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
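Regarding the "ops" admin socket command that the Pacific health-messages page points to: a minimal sketch of how it can be exercised, assuming the commands are run as root on the host where the given MDS daemon is active (only the cluster name floki and the daemon names come from the thread above):

# list outstanding metadata operations on one of the MDS daemons reporting
# slow requests; run it on the host where that daemon lives and repeat for
# icadmin015, icadmin006 and icadmin007
ceph --cluster floki daemon mds.icadmin011 ops

# "help" prints every command this admin socket accepts (dump_ops_in_flight,
# "dump cache <path>", and so on), which is handy when the documentation for
# one release and the installed binaries disagree
ceph --cluster floki daemon mds.icadmin011 help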