What exactly do you set to 64k? We used to set mds_max_caps_per_client to 50000, but once we started using the tuned caps recall config, we reverted to the default of 1M without issue.
mds_max_caps_per_client. As I mentioned, some clients hit this limit regularly and they aren't entirely idle. I will keep tuning the recall settings, though.
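
For anyone following along, the recall tuning happens via the stock Nautilus recall options; something along these lines (the values are purely illustrative, not a recommendation -- tune against your own workload):

    # Illustrative values only -- tune against your own workload.
    ceph config set mds mds_recall_max_caps 30000
    ceph config set mds mds_recall_max_decay_rate 1.5
    # And to go back to the stock 1M per-client cap limit:
    ceph config rm mds mds_max_caps_per_client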
This 15k-caps client I mentioned is not related to the mds_max_caps_per_client setting. In recent Nautilus releases, the MDS proactively recalls caps from idle clients, so even a client with just a few caps like this one can trigger the caps recall warnings (if it is buggy, as in this case). The client doesn't cause any real problems, just the annoying warnings.
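
If it helps anyone hunting down such a client: per-session cap counts are visible in the MDS session list. One way to slice it (the jq pipeline is just an example; <name> is your MDS daemon name):

    # List sessions sorted by cap count, highest last.
    ceph tell mds.<name> session ls | \
        jq -r '.[] | "\(.id) \(.num_caps) \(.client_metadata.hostname // "-")"' | \
        sort -k2 -n | tail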
We hardly see the warnings during normal operation any more. I remember having massive issues with early Nautilus releases; thanks to the more aggressive recall behaviour in newer releases, that is fixed. Back then it was virtually impossible to keep the MDS within the bounds of its memory limit. Nowadays the warnings only appear when the MDS is really stressed, and in that situation overall FS performance is already massively degraded and the MDSs are likely to fail and end up in the rejoin loop.
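
In case it's useful to others: the memory limit in question is mds_cache_memory_limit. A quick sketch of raising it and checking actual usage (the 16 GiB figure is an example, not advice):

    # Example: 16 GiB cache limit for all MDS daemons.
    ceph config set mds mds_cache_memory_limit 17179869184
    # Check what the cache actually occupies (run on the MDS host):
    ceph daemon mds.<name> cache status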
Multi-active + pinning definitely increases overall metadata throughput (once you can get the relevant inodes cached), because, as you know, the MDS is single-threaded and CPU-bound at the limit. We could get something like 4-5k handle_client_request operations per second out of a single MDS, and that really does scale horizontally as you add MDSs (and pin).
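
For anyone wanting to try this, the basic moving parts are max_mds plus the ceph.dir.pin xattr; the filesystem name and paths below are made up:

    # Two active MDS ranks on the filesystem <fs_name>:
    ceph fs set <fs_name> max_mds 2
    # Pin subtrees to specific ranks via the ceph.dir.pin xattr:
    setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/projectA
    setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projectB

Per-rank request rates then show up in the ACTIVITY column of "ceph fs status", which makes it easy to see whether the load actually splits.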
Okay, I will definitely re-evaluate options for pinning individual directories; perhaps a small script can do it (see the sketch below).
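
Something along these lines should do, assuming one top-level directory per user and two active ranks (both assumptions, obviously):

    #!/bin/bash
    # Round-robin top-level directories across the active MDS ranks.
    # /mnt/cephfs/home and RANKS=2 are assumptions -- adapt to your layout.
    RANKS=2
    i=0
    for d in /mnt/cephfs/home/*/; do
        setfattr -n ceph.dir.pin -v $((i % RANKS)) "$d"
        i=$((i + 1))
    done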
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx