Ah, I see. Yes, we are already running version 18.2.1 on the server side
(we just installed this cluster a few weeks ago from scratch). So if the
fix has already been backported to that version, then I guess we still
have a problem. Does that mean it could be the locker order bug
(https://tracker.ceph.com/issues/62123), as Xiubo suggested?

Thanks again,
Erich

> On Apr 7, 2024, at 9:00 PM, Alexander E. Patrakov <patrakov@xxxxxxxxx> wrote:
>
> Hi Erich,
>
>> On Mon, Apr 8, 2024 at 11:51 AM Erich Weiler <weiler@xxxxxxxxxxxx> wrote:
>>
>> Hi Xiubo,
>>
>>> Thanks for your logs, and it should be the same issue with
>>> https://tracker.ceph.com/issues/62052, could you try to test with this
>>> fix again?
>>
>> This sounds good - but I'm not clear on what I should do. I see a patch
>> in that tracker page, is that what you are referring to? If so, how
>> would I apply such a patch? Or is there simply a binary update I can
>> apply somehow to the MDS server software?
>
> The backport of this patch (https://github.com/ceph/ceph/pull/53241)
> was merged on October 18, 2023, and Ceph 18.2.1 was released on
> December 18, 2023. Therefore, if you are running Ceph 18.2.1 on the
> server side, you already have the fix. If you are already running
> version 18.2.1 or 18.2.2 (to which you should upgrade anyway), please
> complain, as the purported fix is then ineffective.
>
>> Thanks for helping!
>>
>> -erich
>>
>>> Please let me know if you still see this bug; then it should be the
>>> locker order bug as in https://tracker.ceph.com/issues/62123.
>>>
>>> Thanks
>>>
>>> - Xiubo
>>>
>>> On 3/28/24 04:03, Erich Weiler wrote:
>>>> Hi All,
>>>>
>>>> I've been battling this for a while and I'm not sure where to go from
>>>> here. I have a Ceph health warning as such:
>>>>
>>>> # ceph -s
>>>>   cluster:
>>>>     id:     58bde08a-d7ed-11ee-9098-506b4b4da440
>>>>     health: HEALTH_WARN
>>>>             1 MDSs report slow requests
>>>>             1 MDSs behind on trimming
>>>>
>>>>   services:
>>>>     mon: 5 daemons, quorum
>>>>          pr-md-01,pr-md-02,pr-store-01,pr-store-02,pr-md-03 (age 5d)
>>>>     mgr: pr-md-01.jemmdf(active, since 3w), standbys: pr-md-02.emffhz
>>>>     mds: 1/1 daemons up, 2 standby
>>>>     osd: 46 osds: 46 up (since 9h), 46 in (since 2w)
>>>>
>>>>   data:
>>>>     volumes: 1/1 healthy
>>>>     pools:   4 pools, 1313 pgs
>>>>     objects: 260.72M objects, 466 TiB
>>>>     usage:   704 TiB used, 424 TiB / 1.1 PiB avail
>>>>     pgs:     1306 active+clean
>>>>              4    active+clean+scrubbing+deep
>>>>              3    active+clean+scrubbing
>>>>
>>>>   io:
>>>>     client: 123 MiB/s rd, 75 MiB/s wr, 109 op/s rd, 1.40k op/s wr
>>>>
>>>> And the specifics are:
>>>>
>>>> # ceph health detail
>>>> HEALTH_WARN 1 MDSs report slow requests; 1 MDSs behind on trimming
>>>> [WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
>>>>     mds.slugfs.pr-md-01.xdtppo(mds.0): 99 slow requests are blocked >
>>>> 30 secs
>>>> [WRN] MDS_TRIM: 1 MDSs behind on trimming
>>>>     mds.slugfs.pr-md-01.xdtppo(mds.0): Behind on trimming (13884/250)
>>>> max_segments: 250, num_segments: 13884
>>>>
>>>> That "num_segments" number slowly keeps increasing. I suspect I just
>>>> need to tell the MDS servers to trim faster, but after hours of
>>>> googling around I just can't figure out the best way to do it. The
>>>> best I could come up with was to decrease "mds_cache_trim_decay_rate"
>>>> from 1.0 to 0.8 (to start), based on this page:
>>>>
>>>> https://www.suse.com/support/kb/doc/?id=000019740
>>>>
>>>> But it doesn't seem to help; maybe I should decrease it further? I am
>>>> guessing this must be a common issue...?
>>>> I am running Reef on the MDS servers, but most clients are on Quincy.
>>>>
>>>> Thanks for any advice!
>>>>
>>>> cheers,
>>>> erich
>>>
>>
>
> --
> Alexander E. Patrakov
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
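
For reference, a minimal sketch of the commands being discussed above,
assuming a cephadm-managed Reef cluster; the MDS daemon name
(mds.slugfs.pr-md-01.xdtppo) is taken from the health output in the
thread, and defaults may differ on other clusters. To confirm which
version the MDS daemons are actually running:

# ceph versions
# ceph fs status

To inspect and adjust the trim-related settings mentioned in the thread
(ceph config set mds ... applies to all MDS daemons cluster-wide):

# ceph config get mds mds_log_max_segments
# ceph config get mds mds_cache_trim_decay_rate
# ceph config set mds mds_cache_trim_decay_rate 0.8

# then watch whether num_segments starts to drop
# ceph health detail

Whether lowering mds_cache_trim_decay_rate actually clears the MDS_TRIM
warning in this situation is exactly the open question of the thread; the
commands above only show how such a change would typically be applied and
verified.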