MDS Behind on Trimming...

Hi All,

I've been battling this for a while and I'm not sure where to go from here. My cluster is reporting the following health warning:

# ceph -s
    id:     58bde08a-d7ed-11ee-9098-506b4b4da440
    health: HEALTH_WARN
            1 MDSs report slow requests
            1 MDSs behind on trimming

    mon: 5 daemons, quorum pr-md-01,pr-md-02,pr-store-01,pr-store-02,pr-md-03 (age 5d)
    mgr: pr-md-01.jemmdf(active, since 3w), standbys: pr-md-02.emffhz
    mds: 1/1 daemons up, 2 standby
    osd: 46 osds: 46 up (since 9h), 46 in (since 2w)

    volumes: 1/1 healthy
    pools:   4 pools, 1313 pgs
    objects: 260.72M objects, 466 TiB
    usage:   704 TiB used, 424 TiB / 1.1 PiB avail
    pgs:     1306 active+clean
             4    active+clean+scrubbing+deep
             3    active+clean+scrubbing

    client:   123 MiB/s rd, 75 MiB/s wr, 109 op/s rd, 1.40k op/s wr

And the specifics are:

# ceph health detail
HEALTH_WARN 1 MDSs report slow requests; 1 MDSs behind on trimming
[WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
    99 slow requests are blocked > 30 secs
[WRN] MDS_TRIM: 1 MDSs behind on trimming
    Behind on trimming (13884/250) max_segments: 250, num_segments: 13884
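
To see how quickly the backlog is growing, the two numbers in the MDS_TRIM line can be pulled out with a small shell helper. This is only a sketch: the function name `trim_backlog` is made up for illustration, and it assumes the warning text keeps the `Behind on trimming (num/max)` format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph health detail` text on stdin and print
# "<num_segments> <max_segments> <how many times over the limit>".
# Prints nothing if no "Behind on trimming (N/M)" text is found.
trim_backlog() {
    # Keep only the N and M from "Behind on trimming (N/M)".
    sed -n 's/.*Behind on trimming (\([0-9][0-9]*\)\/\([0-9][0-9]*\)).*/\1 \2/p' |
    head -n 1 |
    # Integer ratio of current segments to the configured maximum.
    awk '$2 > 0 { printf "%d %d %d\n", $1, $2, int($1 / $2) }'
}

# Example using the warning text from this post instead of a live cluster.
printf '%s\n' 'Behind on trimming (13884/250) max_segments: 250, num_segments: 13884' |
    trim_backlog
# prints: 13884 250 55
```

On a live cluster the same function could be fed from `ceph health detail | trim_backlog` and run every few minutes to compare successive samples.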

The "num_segments" value keeps slowly increasing. I suspect I just need to tell the MDS daemons to trim faster, but after hours of googling I can't figure out the best way to do that. The best I could come up with was to decrease "mds_cache_trim_decay_rate" from 1.0 to 0.8 (to start), based on this page:

But it doesn't seem to help. Maybe I should decrease it further? I'm guessing this must be a common issue. I am running Reef on the MDS servers, but most clients are on Quincy.
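
For anyone wanting to reproduce the option change described above, it could be made and verified roughly as follows. This is a sketch under assumptions: the helper names `run` and `show_and_set_mds_option` are invented for illustration, the setting is applied to the generic `mds` section rather than to one named daemon, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 in the environment to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: show an MDS option, change it, then show it again.
show_and_set_mds_option() {
    opt=$1
    new=$2
    run ceph config get mds "$opt"          # value before the change
    run ceph config set mds "$opt" "$new"   # apply the new value
    run ceph config get mds "$opt"          # confirm the change took effect
}

show_and_set_mds_option mds_cache_trim_decay_rate 0.8
```

The same helper could be pointed at other MDS options. I believe `mds_log_max_segments` is the option behind the "max_segments: 250" value in the warning, but I have not confirmed that changing it is the right fix here.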

Thanks for any advice!

