I encountered the exact same issue earlier today immediately after upgrading a customer's cluster from 12.2.2 to 12.2.5.
I fixed it by evicting the session and restarting the Ganesha client; I couldn't find any obvious cause either.
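For reference, the eviction step can be sketched roughly as follows. This is a minimal sketch, not the exact procedure used here: it assumes the `ceph tell mds.<id> client ls` and `ceph tell mds.<id> client evict id=<session>` commands from the Ceph client-eviction documentation, and it assumes that `client ls` prints a JSON list in which each session carries an `id` and a `client_metadata.hostname` field. The helper names (`find_sessions`, `evict_cmd`, `evict_host`) and the MDS rank and hostname in the usage note are hypothetical.

```python
import json
import subprocess


def client_ls_cmd(mds):
    """Build the command that lists client sessions on one MDS."""
    return ["ceph", "tell", f"mds.{mds}", "client", "ls"]


def evict_cmd(mds, session_id):
    """Build the command that evicts a single client session."""
    return ["ceph", "tell", f"mds.{mds}", "client", "evict", f"id={session_id}"]


def find_sessions(sessions, hostname):
    """Return the ids of all sessions whose client hostname matches.

    Assumption: each session dict has 'id' and, optionally,
    'client_metadata' -> 'hostname'. Sessions without it are skipped.
    """
    return [
        s["id"]
        for s in sessions
        if s.get("client_metadata", {}).get("hostname") == hostname
    ]


def evict_host(mds, hostname, dry_run=True):
    """List sessions on an MDS and evict those belonging to one host.

    With dry_run=True the evict commands are only returned, not executed.
    """
    result = subprocess.run(
        client_ls_cmd(mds), check=True, capture_output=True, text=True
    )
    # Assumption: stdout is a plain JSON list of sessions.
    sessions = json.loads(result.stdout)
    commands = [evict_cmd(mds, sid) for sid in find_sessions(sessions, hostname)]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `evict_host("0", "node1")` on an admin node would only return the evict commands it would run; pass `dry_run=False` to execute them. The evicted client (ganesha or ceph-fuse) still has to be restarted or remounted afterwards.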
Paul
2018-05-28 16:38 GMT+02:00 Oliver Freyermuth <freyermuth@xxxxxxxxxxxxxxxxxx>:
Dear Cephalopodians,
we just had a "lockup" of many MDS requests that lasted for over 2 days, during which trimming also fell behind.
One of the clients (all ceph-fuse 12.2.5 on CentOS 7.5) was in status "currently failed to authpin local pins". Metadata pool usage grew by 10 GB in those 2 days.
Rebooting the node to force a client eviction solved the issue, and now metadata usage is down again, and all stuck requests were processed quickly.
Is there any idea what could cause something like that? On the client, there was no CPU load, but many processes were waiting for CephFS to respond.
Syslog did not yield anything. It only affected one user and his user directory.
If there are no ideas: How can I collect good debug information in case this happens again?
Cheers,
Oliver
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90