Re: failing to respond to cache pressure

Eugen Block <eblock@xxxxxx> · Mon, 13 Aug 2018 13:47:48 +0000

Hi,

Depending on your kernel (some versions leak memory with CephFS),
increasing mds_cache_memory_limit could help. What is your current
setting?

ceph:~ # ceph daemon mds.<MDS> config show | grep mds_cache_memory_limit

We had these messages for months, almost every day.
They occurred when the hourly backup jobs ran and the MDS had to serve
an additional client (one that searches the whole CephFS for changes)
on top of the existing CephFS clients. First we updated all clients to
a more recent kernel version, but the warnings didn't stop. Then, last
week, we doubled the cache size from 2 GB to 4 GB, and I haven't seen
the warning since (for now).
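
For reference, mds_cache_memory_limit is given in bytes, so "4 GB" has
to be converted first. Below is a minimal sketch of that arithmetic,
assuming the sizes are meant as GiB (1024^3 bytes). The helper names and
the MDS name "a" are placeholders of mine, not anything from our setup.
The script only prints the admin socket command; it does not run it.

```python
# Sketch: convert a cache size in GiB to the byte value that
# mds_cache_memory_limit expects and build the admin socket command.
# Assumption: "2 GB" / "4 GB" are interpreted as GiB (1024**3 bytes).
GIB = 1024 ** 3


def cache_limit_bytes(gib: float) -> int:
    """Return the cache limit in bytes for a size given in GiB."""
    return int(gib * GIB)


def config_set_command(mds_name: str, gib: float) -> str:
    """Build the 'ceph daemon ... config set' command as a string."""
    return (
        f"ceph daemon mds.{mds_name} config set "
        f"mds_cache_memory_limit {cache_limit_bytes(gib)}"
    )


if __name__ == "__main__":
    # "a" is a placeholder MDS name; substitute your own daemon name.
    print(config_set_command("a", 4))
```

A value set through the admin socket applies to the running daemon
only. To keep it across restarts, the same byte value would also go
into the [mds] section of ceph.conf.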

Experiment with the cache size to find a setting that fits your needs,
but don't forget to monitor your MDS in case something goes wrong.
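
One simple way to watch for this is to scan the cluster log for lines
like the ones quoted further down in this thread. The following is only
a rough sketch: the regular expressions are modelled on those quoted
lines, the function name is my own, and it assumes each log entry sits
on a single line (as in the log file itself, not as wrapped by a mail
client).

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this thread (assumption:
# the message wording is the same on your Ceph version).
CLIENT_RE = re.compile(
    r"\[WRN\]\s+MDS health message \((mds\.\d+)\): "
    r"Client (\S+) failing to respond to cache pressure"
)
COUNT_RE = re.compile(
    r"\[WRN\]\s+Health check (?:failed|update): "
    r"(\d+) clients? failing to respond to cache pressure"
)


def summarize(lines):
    """Return (peak number of failing clients, Counter of client names)."""
    peak = 0
    clients = Counter()
    for line in lines:
        match = COUNT_RE.search(line)
        if match:
            peak = max(peak, int(match.group(1)))
            continue
        match = CLIENT_RE.search(line)
        if match:
            clients[match.group(2)] += 1
    return peak, clients


if __name__ == "__main__":
    sample = [
        "2018-08-13 17:34:59.045510 [WRN]  MDS health message (mds.0): "
        "Client docker75:docker failing to respond to cache pressure",
        "2018-08-13 17:34:59.435933 [WRN]  Health check failed: 1 clients "
        "failing to respond to cache pressure (MDS_CLIENT_RECALL)",
        "2018-08-13 17:35:59.870055 [WRN]  Health check update: 6 clients "
        "failing to respond to cache pressure (MDS_CLIENT_RECALL)",
        "2018-08-13 17:39:23.682213 [INF]  Cluster is now healthy",
    ]
    peak, clients = summarize(sample)
    print("peak failing clients:", peak)
    print("clients named in warnings:", dict(clients))
```

Comparing the peak before and after a cache size change gives a rough
idea of whether the change helped.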

Regards,
Eugen

Quoting Wido den Hollander <wido@xxxxxxxx>:

On 08/13/2018 01:22 PM, Zhenshi Zhou wrote:
Hi,
The cluster has been running healthy recently, but I get warning messages every day:

Which version of Ceph? Which version of clients?

Can you post:

$ ceph versions
$ ceph features
$ ceph fs status

Wido

2018-08-13 17:39:23.682213 [INF]  Cluster is now healthy
2018-08-13 17:39:23.682144 [INF]  Health check cleared: MDS_CLIENT_RECALL (was: 6 clients failing to respond to cache pressure)
2018-08-13 17:39:23.052022 [INF]  MDS health message cleared (mds.0): Client docker38:docker failing to respond to cache pressure
2018-08-13 17:39:23.051979 [INF]  MDS health message cleared (mds.0): Client docker73:docker failing to respond to cache pressure
2018-08-13 17:39:23.051934 [INF]  MDS health message cleared (mds.0): Client docker74:docker failing to respond to cache pressure
2018-08-13 17:39:23.051853 [INF]  MDS health message cleared (mds.0): Client docker75:docker failing to respond to cache pressure
2018-08-13 17:39:23.051815 [INF]  MDS health message cleared (mds.0): Client docker27:docker failing to respond to cache pressure
2018-08-13 17:39:23.051753 [INF]  MDS health message cleared (mds.0): Client docker27 failing to respond to cache pressure
2018-08-13 17:38:11.100331 [WRN]  Health check update: 6 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:37:39.570014 [WRN]  Health check update: 5 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:37:31.099418 [WRN]  Health check update: 3 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:36:34.564345 [WRN]  Health check update: 1 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:36:27.121891 [WRN]  Health check update: 3 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:36:11.967531 [WRN]  Health check update: 5 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:35:59.870055 [WRN]  Health check update: 6 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:35:47.787323 [WRN]  Health check update: 3 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:34:59.435933 [WRN]  Health check failed: 1 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)
2018-08-13 17:34:59.045510 [WRN]  MDS health message (mds.0): Client docker75:docker failing to respond to cache pressure

How can I fix it?

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
