Thanks John. I will back the test down to the simple case of a single client running only NFS Ganesha, with no kernel driver loaded, then work forward until I trip the problem and report my findings.
Eric
On Mon, Jul 13, 2015 at 2:18 AM, John Spray <john.spray@xxxxxxxxxx> wrote:
It would help if we knew whether it's the kernel clients or the userspace clients that are generating the warnings here. You've probably already done this, but I'd get rid of any unused kernel client mounts to simplify the situation.
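For example, a quick way to confirm whether any kernel CephFS mounts are still present (the mount point below is only a placeholder):

# grep ceph /proc/mounts
# umount /mnt/cephfs

Kernel mounts show up there with a filesystem type of ceph, while ceph-fuse mounts would show as fuse.ceph-fuse.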
On 13/07/2015 04:02, Eric Eastman wrote:
Hi John,
I am seeing this problem with Ceph v9.0.1 and the 4.1 kernel on all
nodes. This setup uses 4 CephFS client systems. They all have the
CephFS kernel driver loaded, but none are mounting the file system.
All 4 clients share out the Ceph file system through libcephfs, via
Ganesha NFS (v2.2.0-2) and Samba (4.3.0pre1-GIT-0791bb0).
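For reference, a minimal sketch of that kind of setup (not the exact configs from these gateways; export IDs, share names and paths are placeholders):

Ganesha (Ceph FSAL):
EXPORT {
    Export_Id = 1;
    Path = "/";
    Pseudo = "/cephfs";
    Access_Type = RW;
    FSAL { Name = CEPH; }
}

Samba (vfs_ceph):
[cephfs]
    path = /
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf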
# ceph -s
    cluster 6d8aae1e-1125-11e5-a708-001b78e265be
     health HEALTH_WARN
            4 near full osd(s)
            mds0: Client ede-c2-gw01 failing to respond to cache pressure
            mds0: Client ede-c2-gw02:cephfs failing to respond to cache pressure
            mds0: Client ede-c2-gw03:cephfs failing to respond to cache pressure
     monmap e1: 3 mons at {ede-c2-mon01=10.15.2.121:6789/0,ede-c2-mon02=10.15.2.122:6789/0,ede-c2-mon03=10.15.2.123:6789/0}
            election epoch 8, quorum 0,1,2 ede-c2-mon01,ede-c2-mon02,ede-c2-mon03
     mdsmap e912: 1/1/1 up {0=ede-c2-mds03=up:active}, 2 up:standby
     osdmap e272: 8 osds: 8 up, 8 in
      pgmap v225264: 832 pgs, 4 pools, 188 GB data, 5173 kobjects
            212 GB used, 48715 MB / 263 GB avail
                 832 active+clean
  client io 1379 kB/s rd, 20653 B/s wr, 98 op/s
We haven't tested the cache limit enforcement with NFS Ganesha, so there is a decent chance that it is broken. The Ganesha FSAL is doing ll_get/ll_put reference counting on inodes, so it seems quite possible that its cache is pinning things that we would otherwise be evicting in response to cache pressure. You mention Samba as well.
You can see if the MDS cache is indeed exceeding its limit by looking at the output of:
ceph daemon mds.<daemon id> perf dump mds
...where the "inodes" value tells you how many are in the cache, vs. inode_max.
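For example, against the active MDS shown in the status output above (daemon name illustrative; assumes Python is available on the MDS host for pretty-printing the JSON):

# ceph daemon mds.ede-c2-mds03 perf dump mds | python -m json.tool | egrep '"inodes":|"inode_max"'

If "inodes" stays well above "inode_max" while the warning is present, the cache really is growing past its limit rather than being trimmed.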
If you can, it would be useful to boil this down to a straightforward test case: if you start with a healthy cluster, mount a single ganesha client, and do your 5 million file procedure, do you get the warning? Same for samba/kernel mounts -- this is likely to be a client side issue, so we need to confirm which client is misbehaving.
Cheers,
John
# cat /proc/version
Linux version 4.1.0-040100-generic (kernel@gomeisa) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201506220235 SMP Mon Jun 22 06:36:19 UTC 2015
# ceph -v
ceph version 9.0.1 (997b3f998d565a744bfefaaf34b08b891f8dbf64)
The systems are all running Ubuntu Trusty upgraded to the 4.1 kernel.
These are all physical machines, no VMs. The test run that caused the
problem was creating and verifying 5 million small files.
We have some tools that flag when Ceph is in a WARN state, so it would
be nice to get rid of this warning.
Please let me know what additional information you need.
Thanks,
Eric
On Fri, Jul 10, 2015 at 4:19 AM, 谷枫 <feicheche@xxxxxxxxx> wrote:
Thank you John,
All my servers are Ubuntu 14.04 with the 3.16 kernel.
Not all of the clients show this problem, and the cluster seems to be functioning well now.
As you say, I will change mds_cache_size from 100000 to 500000 and run a test. Thanks again!
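For reference, a sketch of the two usual ways to apply that change, either injected at runtime or set persistently in ceph.conf (the MDS rank below is illustrative):

At runtime:
# ceph tell mds.0 injectargs '--mds_cache_size 500000'

Persistently, in ceph.conf on the MDS host (followed by an MDS restart):
[mds]
    mds cache size = 500000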
2015-07-10 17:00 GMT+08:00 John Spray <john.spray@xxxxxxxxxx>:
This is usually caused by use of older kernel clients. I don't remember
exactly what version it was fixed in, but iirc we've seen the problem with
3.14 and seen it go away with 3.18.
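The quickest check is uname -r on each client host; a recent enough MDS can also report what it knows about connected clients through the admin socket (daemon name is a placeholder):

# ceph daemon mds.<name> session ls

Userspace clients, and sufficiently new kernel clients, should include version details in the client_metadata section of that output.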
If your system is otherwise functioning well, this is not a critical error
-- it just means that the MDS might not be able to fully control its memory
usage (i.e. it can exceed mds_cache_size).
John