On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth
<freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> in our case, there's only a single active MDS
> (+1 standby-replay + 1 standby).
> We also get the health warning when it happens.

Were there "client.xxx isn't responding to mclientcaps(revoke)" warnings
in the cluster log? Please send them to me if there were.

> Cheers,
> Oliver
>
> On 30.05.2018 at 03:25, Yan, Zheng wrote:
>> It could be http://tracker.ceph.com/issues/24172
>>
>> On Wed, May 30, 2018 at 9:01 AM, Linh Vu <vul@xxxxxxxxxxxxxx> wrote:
>>> In my case, I have multiple active MDS (with directory pinning at the
>>> very top level), and there would be a "Client xxx failing to respond to
>>> capability release" health warning every single time that happens.
>>>
>>> ________________________________
>>> From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Yan, Zheng
>>> <ukernel@xxxxxxxxx>
>>> Sent: Tuesday, 29 May 2018 9:53:43 PM
>>> To: Oliver Freyermuth
>>> Cc: Ceph Users; Peter Wienemann
>>> Subject: Re: Ceph-fuse getting stuck with "currently failed to
>>> authpin local pins"
>>>
>>> Single or multiple active MDS? Were there "Client xxx failing to
>>> respond to capability release" health warnings?
>>>
>>> On Mon, May 28, 2018 at 10:38 PM, Oliver Freyermuth
>>> <freyermuth@xxxxxxxxxxxxxxxxxx> wrote:
>>>> Dear Cephalopodians,
>>>>
>>>> we just had a "lockup" of many MDS requests, and trimming also fell
>>>> behind, for over 2 days.
>>>> One of the clients (all ceph-fuse 12.2.5 on CentOS 7.5) was in status
>>>> "currently failed to authpin local pins". Metadata pool usage grew by
>>>> 10 GB in those 2 days.
>>>>
>>>> Rebooting the node to force a client eviction solved the issue; metadata
>>>> usage is now down again, and all stuck requests were processed quickly.
>>>>
>>>> Is there any idea what could cause something like that? On the client,
>>>> there was no CPU load, but many processes were waiting for cephfs to
>>>> respond. Syslog didn't yield anything. It only affected one user and
>>>> his user directory.
>>>>
>>>> If there are no ideas: how can I collect good debug information in case
>>>> this happens again?
>>>>
>>>> Cheers,
>>>> Oliver

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
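
As a follow-up for anyone searching the archives: below is a minimal Python
sketch of how one might pull the two warning messages quoted in this thread
out of the cluster log before sending them along. The log path
/var/log/ceph/ceph.log is an assumption (the usual cluster log location on a
monitor host); adjust it for your deployment.

#!/usr/bin/env python3
"""Minimal sketch: scan a Ceph cluster log for the cap-release warnings
quoted in this thread. The default log path is an assumption; pass a
different path as the first argument if yours differs."""

import sys

# Assumed default cluster log location on a monitor host.
LOG_PATH = sys.argv[1] if len(sys.argv) > 1 else "/var/log/ceph/ceph.log"

# The two message substrings quoted earlier in the thread.
PATTERNS = [
    "failing to respond to capability release",
    "isn't responding to mclientcaps(revoke)",
]

def main() -> None:
    hits = 0
    with open(LOG_PATH, errors="replace") as log:
        for line in log:
            # Print every log line containing either warning substring.
            if any(p in line for p in PATTERNS):
                hits += 1
                print(line.rstrip())
    print(f"-- {hits} matching line(s) in {LOG_PATH}", file=sys.stderr)

if __name__ == "__main__":
    main()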