On Tue, Jan 17, 2017 at 10:07 AM, Darrell Enns <darrelle@xxxxxxxxxxxx> wrote:
> I've just had one of my cephfs servers showing an "mdsY: Client XXXXX
> failing to respond to capability release" error. The client in question
> was acting strangely, not allowing files to be deleted, etc. The issue was
> cleared by restarting the affected server. I see there have been a few
> posts about this – perhaps related to the MDS cache size. Does anyone know
> if there is some tuning that can be done to prevent this from happening?
> Or is this a bug? I do have plenty of RAM available to increase the MDS
> cache size if necessary – it's currently just left at the default value.
> Is there a tuning guide for the MDS? I can't seem to find any
> recommendations in the docs.

With what you've said, it could be any number of things. Next time this
happens, use the MDS admin socket commands to investigate: dump the ops
that are in flight and see whether the client has any stuck ones, run
"session ls" et al. to see what the MDS is saying about the client, etc.
It sounds like you may have hit a kernel client bug, but we'd need a lot
more detail to track it down.
-Greg

> The cluster is running ceph 10.2.5 and I'm using the kernel cephfs client
> (kernel version 4.9.0). Things I've already investigated:
>
> · Log files (kernel, syslog, etc.) – nothing unusual at all
>
> · Historical graphs of CPU, memory, network, etc. – nothing unusual,
>   plenty of resources available
>
> · Historical graphs of overall cluster load/IO – nothing out of the
>   ordinary
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
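
For reference, the admin socket commands Greg mentions can be run like this
on the host where the MDS daemon lives. This is a sketch only: "mds.a" is a
placeholder daemon name, and mds_cache_size is the Jewel-era inode-count
cache limit left at its default in the original report.

```shell
# Run on the MDS host; replace "mds.a" with your actual MDS daemon name
# (placeholder for illustration).

# Show MDS operations currently in flight; look for ops that reference
# the misbehaving client's session/client id and have been waiting a
# long time.
ceph daemon mds.a dump_ops_in_flight

# List client sessions, including how many caps each client holds and
# what client version it reports.
ceph daemon mds.a session ls

# Check the current cache limit (an inode count in Jewel, default 100000)
# before deciding whether to raise it.
ceph daemon mds.a config get mds_cache_size
```

All three commands only read state, so they are safe to run on a live
cluster while the problem is occurring.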