Hi,

~6K log segments to be trimmed, that's huge.

1. Are there any custom configs set on this setup?
2. Is subtree pinning enabled?
3. Are there any warnings about RADOS slowness?
4. Please share the MDS perf dump so we can check latencies and other counters:

$ ceph tell mds.<id> perf dump

Thanks and Regards,
Kotresh H R

On Fri, May 17, 2024 at 11:01 AM Akash Warkhade <a.warkhade98@xxxxxxxxx> wrote:

> Hi,
>
> We are using rook-ceph with operator 1.10.8 and Ceph 17.2.5.
> We are using CephFS with 4 MDS daemons, i.e. 2 active and 2 standby.
> Every 3-4 weeks the filesystem runs into trouble, and in ceph status we
> see the warnings below:
>
> 2 MDS reports slow requests
> 2 MDS Behind on Trimming
> mds.myfs-a(mds.1): behind on trimming (6378/128) max_segments: 128,
> num_segments: 6378
> mds.myfs-c(mds.1): behind on trimming (6560/128) max_segments: 128,
> num_segments: 6560
>
> To fix it, we have to restart all MDS pods one by one.
> This is happening every 4-5 weeks.
>
> We have seen many related issues on the Ceph tracker, and many people
> suggest increasing mds_cache_memory_limit.
> Currently on our cluster *mds_cache_memory_limit* is set to the default 4GB
> and *mds_log_max_segments* is set to the default 128.
> Should we increase *mds_cache_memory_limit* from the default 4GB to 8GB, or
> is there another solution that fixes this issue permanently?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
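
[Editor's reference: a rough sketch of commands that could gather the information requested in the reply above. It assumes a shell with admin access to the cluster (e.g. the rook-ceph toolbox pod); the MDS name myfs-a is taken from the warnings quoted above, and the 8 GiB value is purely illustrative, not a recommendation from the reply.]

# Non-default (custom) configuration currently set on the cluster
$ ceph config dump

# Current values of the two options mentioned in the question
$ ceph config get mds mds_cache_memory_limit
$ ceph config get mds mds_log_max_segments

# Health detail and per-OSD latencies, to spot RADOS slowness
$ ceph health detail
$ ceph osd perf

# Subtree layout, to see whether any directories are explicitly pinned
# (if "get subtrees" is not accepted via tell, run it with "ceph daemon"
# from inside the MDS pod instead)
$ ceph tell mds.myfs-a get subtrees

# Per-MDS performance counters (latencies, journal/segment stats)
$ ceph tell mds.myfs-a perf dump

# If the cache limit were raised to 8 GiB, the value is given in bytes
$ ceph config set mds mds_cache_memory_limit 8589934592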