On Mon, Dec 7, 2020 at 10:39 AM Janek Bevendorff
<janek.bevendorff@xxxxxxxxxxxxx> wrote:
>
> > What exactly do you set to 64k?
> > We used to set mds_max_caps_per_client to 50000, but once we started
> > using the tuned caps recall config, we reverted that back to the
> > default 1M without issue.
>
> mds_max_caps_per_client. As I mentioned, some clients hit this limit
> regularly and they aren't entirely idle. I will keep tuning the recall
> settings, though.
>
> > This 15k caps client I mentioned is not related to the max caps per
> > client config. In recent Nautilus, the MDS will proactively recall
> > caps from idle clients -- so a client with even just a few caps like
> > this can provoke the caps recall warnings (if it is buggy, like in
> > this case). The client doesn't cause any real problems, just the
> > annoying warnings.
>
> We only see the warnings during normal operation. I remember having
> massive issues with early Nautilus releases, but thanks to the more
> aggressive recall behaviour in newer releases, that is fixed. Back then
> it was virtually impossible to keep the MDS within the bounds of its
> memory limit. Nowadays, the warnings only appear when the MDS is really
> stressed. In that situation, overall FS performance is already massively
> degraded and the MDSs are likely to fail and run into the rejoin loop.
>
> > Multi-active + pinning definitely increases the overall MD throughput
> > (once you can get the relevant inodes cached), because as you know the
> > MDS is single-threaded and CPU-bound at the limit.
> > We could get something like 4-5k handle_client_requests out of a
> > single MDS, and that really does scale horizontally as you add MDSs
> > (and pin).
>
> Okay, I will definitely re-evaluate options for pinning individual
> directories; perhaps a small script can do it.

There is a new ephemeral pinning option in the latest releases, but we
haven't tried it yet.

Here's our script -- it assumes the parent dir is pinned to rank 0 or
that the balancer is disabled:
https://github.com/cernceph/ceph-scripts/blob/master/tools/cephfs/cephfs-bal-shard

Too many pins can cause problems -- though we have something like 700
pins at the moment and it's fine.

Cheers, Dan

> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
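For reference, both kinds of pinning discussed above are driven by extended
attributes on the directory. The commands below are an illustrative sketch
only, not taken from this thread -- the mount point, directory names and
rank numbers are placeholders, and the client key typically needs the 'p'
flag in its MDS caps to set these attributes:

    # Statically pin a directory tree to MDS rank 1
    # (rank 1 must exist, i.e. max_mds >= 2).
    setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projects/foo

    # Remove the explicit pin again (-1 means "no pin").
    setfattr -n ceph.dir.pin -v -1 /mnt/cephfs/projects/foo

    # One of the newer ephemeral pinning options (distributed ephemeral
    # pins): spread the immediate children of a parent directory across
    # all active ranks instead of pinning each child by hand.
    ceph config set mds mds_export_ephemeral_distributed true
    setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home

    # Check which rank currently holds which subtree.
    ceph tell mds.<name> get subtrees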