+ other ceph-users

On Wed, Jul 24, 2019 at 10:26 AM Janek Bevendorff
<janek.bevendorff@xxxxxxxxxxxxx> wrote:
>
> > what's the ceph.com mailing list? I wondered whether this list is dead
> > but it's the list announced on the official ceph.com homepage, isn't it?
>
> There are two mailing lists announced on the website. If you go to
> https://ceph.com/resources/ you will find the
> subscribe/unsubscribe/archive links for the (much more active) ceph.com
> MLs. But if you click on "Mailing Lists & IRC page" you will get to a
> page where you can subscribe to this list, which is different. Very
> confusing.

It is confusing. This is supposed to be the new ML but I don't think
the migration has started yet.

> > What did you have the MDS cache size set to at the time?
> >
> > < and an inode count between
>
> I actually did not think I'd get a reply here. We are a bit further than
> this on the other mailing list. This is the thread:
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-July/036095.html
>
> To sum it up: the ceph client prevents the MDS from freeing its cache,
> so inodes keep piling up until either the MDS becomes too slow (fixable
> by increasing the beacon grace time) or runs out of memory. The latter
> will happen eventually. In the end, my MDSs couldn't even rejoin because
> they hit the host's 128GB memory limit and crashed.

It's possible the MDS is not being aggressive enough with asking the
single (?) client to reduce its cache size. There were recent changes [1]
to the MDS to improve this. However, the defaults may not be aggressive
enough for your client's workload. Can you try:

ceph config set mds mds_recall_max_caps 10000
ceph config set mds mds_recall_max_decay_rate 1.0

Also your other mailings made me think you may still be using the old
inode limit for the cache size. Are you using the new
mds_cache_memory_limit config option?

Finally, if this fixes your issue (please let us know!) and you decide
to try multiple active MDS, you should definitely use pinning as the
parallel create workload will greatly benefit from it.

[1] https://ceph.com/community/nautilus-cephfs/

--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
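
A minimal sketch of the cache-limit and pinning suggestions above,
assuming a Nautilus cluster and a CephFS mounted at /mnt/cephfs (the
byte value, directory names, and ranks are illustrative, not taken from
the thread):

# Memory-based MDS cache limit, in bytes; replaces the old
# mds_cache_size inode-count limit. ~16 GiB shown here.
ceph config set mds mds_cache_memory_limit 17179869184

# With multiple active MDS, pin directory trees to ranks via an extended
# attribute on a mounted CephFS (setfattr comes from the attr package).
setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/project-a   # rank 0 serves this tree
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/project-b   # rank 1 serves this tree

Subdirectories inherit the parent's pin unless they are pinned
themselves; setting a pin value of -1 removes the pin.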