Re: After Luminous upgrade: ceph-fuse clients failing to respond to cache pressure

Andras Pataki <apataki@xxxxxxxxxxxxxxxxxxxxx> · Wed, 17 Jan 2018 10:36:59 -0500

Hi John,

All our hosts run CentOS 7.  The majority are 7.4 with kernel
3.10.0-693.5.2.el7.x86_64 and fuse 2.9.2-8.el7.  A few hosts have
slightly different kernel versions; the oldest are a handful of
CentOS 7.3 hosts with kernel 3.10.0-514.21.1.el7.x86_64 and
fuse 2.9.2-7.el7.  I know Red Hat has been backporting lots of stuff, so
perhaps these kernels fall into the category you are describing?

When the cache pressure problem happens, is there an easy way to tell
exactly which hosts are involved and what items are in their caches?
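
A rough sketch of one way to approach the "which hosts" half of that question, assuming the MDS admin socket's `session ls` command returns a JSON list whose entries carry `num_caps` and `client_metadata.hostname` fields (field names should be checked against the actual output on your version). The helper name `sessions_by_caps` and the sample data are invented for illustration. It only ranks clients by how many capabilities they hold; it does not show which items are cached.

```python
import json


def sessions_by_caps(session_ls_json):
    """Rank MDS client sessions by the number of caps they hold.

    session_ls_json: text assumed to be the JSON output of
    `ceph daemon mds.<name> session ls` (field names are assumptions).
    Returns a list of (num_caps, hostname, session_id), largest first.
    """
    sessions = json.loads(session_ls_json)
    rows = []
    for session in sessions:
        metadata = session.get("client_metadata", {})
        rows.append((
            session.get("num_caps", 0),
            metadata.get("hostname", "unknown"),
            session.get("id"),
        ))
    # Sort on the cap count only, so missing ids or hostnames cannot break sorting.
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


# Invented sample data, shaped like the assumed output; not from a real cluster.
SAMPLE = """
[
  {"id": 1001, "num_caps": 120,   "client_metadata": {"hostname": "node-a"}},
  {"id": 1002, "num_caps": 98000, "client_metadata": {"hostname": "node-b"}},
  {"id": 1003, "num_caps": 4500,  "client_metadata": {"hostname": "node-c"}}
]
"""

if __name__ == "__main__":
    for num_caps, hostname, session_id in sessions_by_caps(SAMPLE):
        print(hostname, session_id, num_caps)
```

In practice the input would come from running `ceph daemon mds.<name> session ls` on the host of the active MDS and saving the output to a file; clients near the top of the list are the likeliest candidates for not releasing caps.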

Andras

On 01/17/2018 06:09 AM, John Spray wrote:
On Tue, Jan 16, 2018 at 8:50 PM, Andras Pataki
<apataki@xxxxxxxxxxxxxxxxxxxxx> wrote:
Dear Cephers,

We've upgraded the back end of our cluster from Jewel (10.2.10) to Luminous
(12.2.2).  The upgrade went smoothly for the most part, except we seem to be
hitting an issue with cephfs.  After about a day or two of use, the MDS
starts complaining about clients failing to respond to cache pressure:

What's the OS, kernel version and fuse version on the hosts where the
clients are running?

There have been some issues with ceph-fuse losing the ability to
properly invalidate cached items when certain updated OS packages were
installed.

Specifically, ceph-fuse checks the kernel version against 3.18.0 to
decide which invalidation method to use, and if your OS has backported
new behaviour to a low-version-numbered kernel, that can confuse it.
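
To illustrate why a plain version comparison can mislead on distribution kernels, here is a minimal sketch. It is not ceph-fuse's actual code: the function names are hypothetical, and only the 3.18.0 threshold and the two CentOS release strings come from this thread (the `4.4.0-112-generic` string is an invented example of a newer kernel).

```python
import re

# Threshold mentioned in the thread; the logic around it is illustrative only.
THRESHOLD = (3, 18, 0)


def parse_kernel_release(release):
    """Return the leading major.minor.patch of a `uname -r` string as ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % (release,))
    return tuple(int(part) for part in match.groups())


def looks_older_than_threshold(release, threshold=THRESHOLD):
    """True if the upstream version number alone is below the threshold.

    Backported features are invisible to this check: everything after
    the first three numbers (e.g. '-693.5.2.el7') is ignored.
    """
    return parse_kernel_release(release) < threshold


if __name__ == "__main__":
    for release in (
        "3.10.0-693.5.2.el7.x86_64",   # CentOS 7.4 kernel from this thread
        "3.10.0-514.21.1.el7.x86_64",  # CentOS 7.3 kernel from this thread
        "4.4.0-112-generic",           # invented example of a newer kernel
    ):
        print(release, looks_older_than_threshold(release))
```

Both CentOS kernels compare as older than 3.18.0 here, regardless of what has been backported into them, which is the kind of mismatch described above.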

John

[root@cephmon00 ~]# ceph -s
   cluster:
     id:     d7b33135-0940-4e48-8aa6-1d2026597c2f
     health: HEALTH_WARN
             1 MDSs have many clients failing to respond to cache pressure
             noout flag(s) set
             1 osds down

   services:
     mon: 3 daemons, quorum cephmon00,cephmon01,cephmon02
     mgr: cephmon00(active), standbys: cephmon01, cephmon02
     mds: cephfs-1/1/1 up  {0=cephmon00=up:active}, 2 up:standby
     osd: 2208 osds: 2207 up, 2208 in
          flags noout

   data:
     pools:   6 pools, 42496 pgs
     objects: 919M objects, 3062 TB
     usage:   9203 TB used, 4618 TB / 13822 TB avail
     pgs:     42470 active+clean
              22    active+clean+scrubbing+deep
              4     active+clean+scrubbing

   io:
     client:   56122 kB/s rd, 18397 kB/s wr, 84 op/s rd, 101 op/s wr

[root@cephmon00 ~]# ceph health detail
HEALTH_WARN 1 MDSs have many clients failing to respond to cache pressure; noout flag(s) set; 1 osds down
MDS_CLIENT_RECALL_MANY 1 MDSs have many clients failing to respond to cache pressure
     mdscephmon00(mds.0): Many clients (103) failing to respond to cache pressureclient_count: 103
OSDMAP_FLAGS noout flag(s) set
OSD_DOWN 1 osds down
     osd.1296 (root=root-disk,pod=pod0-disk,host=cephosd008-disk) is down

We are using the 12.2.2 fuse client exclusively, on about 350 nodes
(of which about 100 seem not to be responding to cache pressure in this
log).  When this happens, clients also appear pretty sluggish (listing
directories, etc.).  After bouncing the MDS, everything returns to normal
for a while after the failover.  Ignore the message about 1 OSD down; it
corresponds to a failed drive, and all data has been re-replicated since.

We were also using the 12.2.2 fuse client with the Jewel back end before the
upgrade, and did not see this issue then.

We are running with a larger MDS cache than usual: mds_cache_size is
set to 4 million.  All other MDS configs are the defaults.

Is this a known issue?  If not, any hints on how to further diagnose the
problem?

Andras

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
