Hi,
I have dealt with this topic several times, and the SUSE team helped me
understand what's going on under the hood. The summary can be found
in this thread [1].
What helped in our case was reducing mds_recall_max_caps from 30k
(the default) to 3k. We lowered it in steps of 1k, IIRC. So I suggest
reducing that value step by step (maybe starting at 20k or so) to
find the optimal value for your cluster.
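
As a rough illustration of the stepwise approach, here is a minimal Python sketch that only prints a schedule of candidate values together with the matching `ceph config set` commands; it does not touch the cluster. The function names, the 20k starting point, the 3k floor and the 1k step are assumptions for illustration, and applying the option to the generic `mds` target is likewise an assumption about your setup.

```python
def recall_steps(start=20000, floor=3000, step=1000):
    """Yield candidate mds_recall_max_caps values from start down to floor.

    start, floor and step are illustrative defaults, not recommendations.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if start < floor:
        raise ValueError("start must not be below floor")
    value = start
    while value > floor:
        yield value
        value -= step
    # Always finish exactly on the floor, even if step does not divide evenly.
    yield floor


def config_commands(values, who="mds"):
    """Build one 'ceph config set' command string per candidate value."""
    return [f"ceph config set {who} mds_recall_max_caps {v}" for v in values]


if __name__ == "__main__":
    # Print the commands only; run them one at a time by hand.
    for cmd in config_commands(recall_steps()):
        print(cmd)
```

The idea is to apply one value at a time, leave it in place long enough to see whether the "clients failing to respond to cache pressure" warning comes back (for example via `ceph health detail`), and stop at the first value where it stays away.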
Regards,
Eugen
[1] https://www.spinics.net/lists/ceph-users/msg73188.html
Quoting Özkan Göksu <ozkangksu@xxxxxxxxx>:
Hello.
I have a 5-node Ceph cluster and I'm constantly getting the "clients failing to
respond to cache pressure" warning.
I have 84 CephFS kernel clients (servers), and my users access their
personal subvolumes, which are located on one pool.
My users are software developers, and the data is home and user data (Git,
Python projects, sample data, and newly generated data).
---------------------------------------------------------------------------------
--- RAW STORAGE ---
CLASS     SIZE    AVAIL    USED  RAW USED  %RAW USED
ssd    146 TiB  101 TiB  45 TiB    45 TiB      30.71
TOTAL  146 TiB  101 TiB  45 TiB    45 TiB      30.71

--- POOLS ---
POOL                 ID   PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
.mgr                  1     1  356 MiB       90  1.0 GiB      0     30 TiB
cephfs.ud-data.meta   9   256   69 GiB    3.09M  137 GiB   0.15     45 TiB
cephfs.ud-data.data  10  2048   26 TiB  100.83M   44 TiB  32.97     45 TiB
---------------------------------------------------------------------------------
root@ud-01:~# ceph fs status
ud-data - 84 clients
=======
RANK  STATE   MDS                   ACTIVITY      DNS    INOS   DIRS  CAPS
0     active  ud-data.ud-04.seggyv  Reqs: 142 /s  2844k  2798k  303k  720k
POOL                 TYPE      USED   AVAIL
cephfs.ud-data.meta  metadata  137G   44.9T
cephfs.ud-data.data  data      44.2T  44.9T
STANDBY MDS
ud-data.ud-02.xcoojt
ud-data.ud-05.rnhcfe
ud-data.ud-03.lhwkml
ud-data.ud-01.uatjle
MDS version: ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
-----------------------------------------------------------------------------------
My MDS settings are below:
mds_cache_memory_limit | 8589934592
mds_cache_trim_threshold | 524288
mds_recall_global_max_decay_threshold | 131072
mds_recall_max_caps | 30000
mds_recall_max_decay_rate | 1.500000
mds_recall_max_decay_threshold | 131072
mds_recall_warning_threshold | 262144
I have 2 questions:
1- What should I do to prevent the cache pressure warning?
2- What can I do to increase speed?
- Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx