cluster [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST)
cluster [INF] Health check cleared: FS_DEGRADED (was: 1 filesystem is degraded)
cluster [DBG] mds.? [v2:10.100.190.39:6800/2624951349,v1:10.100.190.39:6801/2624951349] up:rejoin
2021-05-26 10:55:33.215102 mon.ceph2mon01 (mon.0) 700 : cluster [DBG] fsmap nxtclfs:2/2 {0=ceph2mon03=up:rejoin,1=ceph2mon01=up:active} 1 up:standby
This degrades the filesystem, and I have assumed that the problem is due to the memory consumption of the MDS process, which can reach around 80% or more of the total memory.
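For reference, one way to check whether the MDS resident memory is going to its cache (rather than growing uncontrollably) is to compare the daemon's cache usage against its configured limit over the admin socket. A minimal sketch, run on the MDS host; the daemon name `ceph2mon03` is taken from the fsmap above and may differ on your cluster:

```shell
# Query the MDS cache usage over the admin socket (run on the MDS host):
ceph daemon mds.ceph2mon03 cache status

# Show the configured cache limit and related recall settings for comparison:
ceph daemon mds.ceph2mon03 config show | grep -E 'mds_cache|mds_recall'
```

These are read-only inspection commands and require a live cluster, so they are shown for illustration only.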
On 26/5/21 at 13:21, Dan van der Ster wrote:
I've seen your other thread. Using 78GB of RAM when the memory limit is set to 64GB is not highly unusual, and doesn't necessarily indicate any problem. It *would* be a problem if the MDS memory grows uncontrollably, however. Otherwise, check those new defaults for caps recall -- they were released around 14.2.19 IIRC.

-- Dan

On Wed, May 26, 2021 at 12:46 PM Andres Rojas Guerrero <a.rojas@xxxxxxx> wrote:

Thanks for the answer. Yes, during these last weeks I have had memory consumption problems on the MDS nodes that led, at least it seemed to me, to performance problems in CephFS. I have been varying, for example:

mds_cache_memory_limit
mds_min_caps_per_client
mds_health_cache_threshold
mds_max_caps_per_client
mds_cache_reservation

But without much knowledge and with a trial-and-error procedure, i.e. observing how CephFS behaved when changing one of the parameters. Although I have achieved some improvement, the procedure does not convince me at all, and that's why I was asking if there was something more reliable ...

On 26/5/21 at 12:15, Dan van der Ster wrote:

Hi,

The mds_cache_memory_limit should be set to something relative to the RAM size of the MDS -- maybe 50% is a good rule of thumb, because there are a few cases where the RSS can exceed this limit. Your experience will help guide what size you need (metadata pool IO activity will be really high if the MDS cache is too small).

Otherwise, in recent releases of N/O/P the defaults for those settings you mentioned are quite good [1]; I would be surprised if they need further tuning for 99% of users. Is there any reason you want to start adjusting these params?
Best Regards,
Dan

[1] https://github.com/ceph/ceph/pull/38574

On Wed, May 26, 2021 at 11:58 AM Andres Rojas Guerrero <a.rojas@xxxxxxx> wrote:

Hi all,

I have observed that the MDS Cache Configuration has 18 parameters:

mds_cache_memory_limit
mds_cache_reservation
mds_health_cache_threshold
mds_cache_trim_threshold
mds_cache_trim_decay_rate
mds_recall_max_caps
mds_recall_max_decay_threshold
mds_recall_max_decay_rate
mds_recall_global_max_decay_threshold
mds_recall_warning_threshold
mds_recall_warning_decay_rate
mds_session_cap_acquisition_throttle
mds_session_cap_acquisition_decay_rate
mds_session_max_caps_throttle_ratio
mds_cap_acquisition_throttle_retry_request_timeout
mds_session_cache_liveness_magnitude
mds_session_cache_liveness_decay_rate
mds_max_caps_per_client

I find the Ceph documentation in this section a bit cryptic, and I have tried to find some resources that talk about how to tune these parameters, but without success. Does anyone have experience adjusting these parameters according to the characteristics of the Ceph cluster itself, the hardware, and the use of MDS?

Regards!

--
*******************************************************
Andrés Rojas Guerrero
Unidad Sistemas Linux
Area Arquitectura Tecnológica
Secretaría General Adjunta de Informática
Consejo Superior de Investigaciones Científicas (CSIC)
Pinar 19
28006 - Madrid
Tel: +34 915680059 -- Ext. 990059
email: a.rojas@xxxxxxx
ID comunicate.csic.es: @50852720l:matrix.csic.es
*******************************************************
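Dan's 50% rule of thumb above can be turned into a concrete number. A hypothetical sketch for a 128 GiB MDS host; the `ceph config set` line assumes a release with centralized configuration (Nautilus or later) and is commented out since it needs a running cluster:

```shell
# Size mds_cache_memory_limit to ~50% of the MDS host's RAM (hypothetical 128 GiB host).
mem_gib=128
limit_bytes=$(( mem_gib / 2 * 1024 * 1024 * 1024 ))
echo "$limit_bytes"    # 68719476736 bytes, i.e. 64 GiB

# Apply to all MDS daemons (illustration only; run against a live cluster):
# ceph config set mds mds_cache_memory_limit "$limit_bytes"
```

Leaving the other half of RAM free gives headroom for the cases Dan mentions where the MDS RSS exceeds the configured cache limit.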
_______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx