Re: Ceph FS - MDS problem

Hi,

We're looking at similar issues here and I was composing a mail just
as you sent this. I'm just a user -- hopefully a dev will correct me
where I'm wrong.

1. A CephFS capability (cap) is how the MDS delegates permission for a
client to do IO on a file while knowing that other clients are not also
accessing it. These caps need to be tracked so they can later be revoked
when other clients want to access the same files. (I didn't find a doc
on CephFS caps, so this is a guess and probably wrong.)
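
If you want to see how many caps each client session currently holds,
the counters are visible via the MDS admin socket -- a quick sketch,
adjust the MDS name/rank to yours:

    # per-session cap counts from the MDS admin socket
    ceph daemon mds.0 session ls | grep -E '"id"|"num_caps"'

That is the same "num_caps" field reported in your session listing.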

2. If you set debug_mds = 3, you can see memory usage and how many caps
are delegated in total in the MDS log. Here's an example:

mds.0.cache check_memory_usage total 7988108, rss 7018088, heap
-457420, malloc -1747875 mmap 0, baseline -457420, buffers 0, max
1048576, 332739 / 332812 inodes have caps, 335839 caps, 1.0091 caps
per inode

It seems there is an integer overflow in the heap and malloc numbers on
our server :(
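
For reference, this is how we turn that logging on -- just a sketch,
adjust the MDS name to yours:

    # persistently, in ceph.conf on the MDS host
    [mds]
        debug mds = 3

    # or at runtime via the admin socket
    ceph daemon mds.0 config set debug_mds 3

The check_memory_usage lines then show up periodically in the MDS log.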

Anyway, once the MDS has delegated (I think) 90% of its max caps, it
will start asking clients to give some back. If those clients don't
release caps, or don't release them fast enough, you'll see...

3. "failing to respond to capability release" and "failing to respond
to cache pressure" can be caused by two different things: an old
client -- maybe 3.14 is too old like Wido said -- or a busy client. We
have a trivial bash script that creates many small files in a loop.
This client is grabbing new caps faster than it can release them.
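
For what it's worth, the script is nothing fancier than this (a sketch,
with a made-up path):

    # create lots of small files under a CephFS mount; every new file
    # means another cap held by this client
    for i in $(seq 1 100000); do
        echo test > /cephfs/captest/file_$i
    done

Watching num_caps for that session climb while the MDS is asking for
releases makes the problem easy to reproduce.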

3.b. BTW, our old friend updatedb seems to trigger the same problem,
grabbing caps very quickly as it indexes CephFS. updatedb.conf is
configured with PRUNEFS="... fuse ...", but the CephFS mount has type
fuse.ceph-fuse, so we'll need to add "ceph" to that list too.
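
Something like this in /etc/updatedb.conf should do it -- hedged, since
the default PRUNEFS list differs per distro and I'm not sure which of
the two spellings updatedb actually matches:

    # add the CephFS filesystem types next to the existing entries
    PRUNEFS="... fuse ceph fuse.ceph-fuse ..."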

4. "mds cache size = 5000000" is going to use a lot of memory! We have
an MDS with just 8GB of RAM and it goes OOM after delegating  around 1
million caps. (this is with mds cache size = 100000, btw)

4.b. "mds cache size" is used for more than one purpose .. it sets the
size of the MDS LRU _and_ it sets the maximum number of client caps.
Those seem like two completely different things... why is it the same
config option?!!!
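
For context, the relevant bit of our ceph.conf is just this (a sketch;
100000 is the default):

    [mds]
        # one knob that controls BOTH the LRU size and the max client caps
        mds cache size = 100000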


For me there are still a couple of things missing related to CephFS
caps and memory usage:
  - a hard limit on the number of caps per client (to prevent a
busy/broken client from DoS'ing the MDS)
  - an automatic way to forcibly revoke caps from a misbehaving
client, e.g. revoke them and put the client into read-only or even
no-IO mode
  - AFAICT, "mds mem max" has been unused since before argonaut -- we
should remove it completely since it is confusing (PR incoming...)
  - the MDS should eventually auto-tune "mds cache size" to fit the
amount of available memory.

Best Regards,

Dan


On Fri, Jul 3, 2015 at 10:25 AM, Mathias Buresch
<mathias.buresch@xxxxxxxxxxxx> wrote:
> Hi there,
>
> maybe you could be so kind and help me with following issue:
>
> We're running CephFS, but there's repeatedly a problem with the MDS.
>
> Sometimes the following error occurs: "mds0: Client 701782 failing to respond to
> capability release"
> Listing the session information shows that "num_caps" on that client is
> much higher than on the other clients (see also the attachment).
>
> The problem is that the load on one of the servers increases to a really
> high value (80 to 100), independent of which client is complaining.
>
> I guess my problem is also that I don't really understand the meaning of
> those "capabilities".
>
> Some facts (let me know if you need more):
>
> CephFS client, MDS, MON, OSD all on the same server
> Kernel client (kernel: 3.14.16-031416-generic)
> MDS config:
>
> We only raised "mds cache size = 5000000" (because before that we saw the
> error "failing to respond to cache pressure")
>
>
> Best regards
> Mathias
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


