On Thu, Sep 8, 2016 at 5:59 AM, Georgi Chorbadzhiyski <georgi.chorbadzhiyski@xxxxxxxxx> wrote:
> Today I was surprised to find our cluster in HEALTH_WARN condition, and
> searching the documentation was no help at all.
>
> Does anybody have an idea how to cure the dreaded "failing to respond
> to cache pressure" message? As I understand it, it tells me that a
> client is not responding to an MDS request to prune its cache, but
> I have no idea what is causing the problem or how to cure it.
>
> I'm using the kernel cephfs driver on kernel 4.4.14.

You probably want to search the list archives (gmane et al.) for this topic. You should also check how many files the clients are actually using compared to how many they hold caps on (you can check caps via the admin socket); it may simply be that the number of in-use inodes is higher than your MDS cache size (100k inodes by default, but you probably have enough memory to increase it by one or two orders of magnitude).
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
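[Editor's note: for readers landing here from a search, a sketch of the commands Greg alludes to. The MDS name ("a" below) and the new cache value are placeholders for your own cluster; run the `daemon` commands on the MDS host itself.]

```shell
# Inspect client sessions via the MDS admin socket; the output includes
# per-client cap counts, so you can see who is holding caps.
ceph daemon mds.a session ls

# Bump the inode cache limit at runtime (default is 100000 inodes).
ceph tell mds.a injectargs '--mds-cache-size 1000000'
```

To make the larger cache persistent, the corresponding ceph.conf fragment would be:

```
[mds]
  mds cache size = 1000000
```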