Re: how to fix slow request without remount or restart mds

Stefan Kooman <stefan@xxxxxx> · Fri, 26 Aug 2022 13:13:15 +0200

On 8/26/22 12:33, zxcs wrote:
Hi, experts

We have a CephFS cluster running version 15.2.* with kernel mounts. Today the cluster health reports MDS slow requests, as shown below. I checked the log of this MDS, and it seems to have been reporting a slow request for a long time.

mds report:
1 MDSs report slow requests

mds log:
log_channel(cluster) log [WRN] : slow request 34616.878139 seconds old, received at 2022-08-26T08:49:16.400430+0800: client_request(client.100807545:2601765 getattr
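
To judge how long a request like this has been stuck, and which client issued it, the warning can be picked apart with a short script. This is only a sketch based on the single (truncated) log line above: the regular expression, the function name parse_slow_request and the field names are illustrative, and the label "tid" for the number after the colon is an assumption about what that number means.

```python
import re

# Pattern for the MDS "slow request" warning quoted above. The sample line is
# truncated after the operation name, so nothing past the op is required.
SLOW_REQUEST = re.compile(
    r"slow request (?P<age>\d+(?:\.\d+)?) seconds old, "
    r"received at (?P<received>\S+): "
    r"client_request\(client\.(?P<client>\d+):(?P<tid>\d+) (?P<op>\w+)"
)


def parse_slow_request(line):
    """Return the fields of a slow-request warning, or None if no match."""
    m = SLOW_REQUEST.search(line)
    if m is None:
        return None
    return {
        "age_hours": float(m["age"]) / 3600,  # seconds -> hours
        "received": m["received"],
        "client": int(m["client"]),
        "tid": int(m["tid"]),  # assumed meaning: request id of that client
        "op": m["op"],
    }


line = (
    "log_channel(cluster) log [WRN] : slow request 34616.878139 seconds old, "
    "received at 2022-08-26T08:49:16.400430+0800: "
    "client_request(client.100807545:2601765 getattr"
)
info = parse_slow_request(line)
print(info)
```

For the line above this gives an age of roughly 9.6 hours for a getattr from client 100807545, so the request has been blocked since the morning rather than being a short hiccup.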

I know we can (maybe) fix this by restarting the MDS, but it seems only one directory, called A, hangs (that is, ls -lrth /ceph/path/A gets stuck); listing other directories works without issue.

This might be this bug: https://tracker.ceph.com/issues/50840
We hit this bug ourselves. A restart of the MDS is necessary. Which version of
Octopus do you run? The bug is fixed in Octopus 15.2.17 [1,2].

So if you are hitting this bug (which might be difficult to tell), you can update
the MDS and restart it with the new version.

My question is: how can we fix this without remounting CephFS on this node or restarting the MDS (which would impact other users)?

Gr. Stefan

[1]: https://docs.ceph.com/en/latest/releases/octopus/#changelog
[2]: https://tracker.ceph.com/issues/51202