On Fri, Jun 9, 2023 at 3:27 AM Janek Bevendorff <janek.bevendorff@xxxxxxxxxxxxx> wrote:
>
> Hi Patrick,
>
> > I'm afraid your ceph-post-file logs were lost to the nether. AFAICT,
> > our ceph-post-file storage has been non-functional since the beginning
> > of the lab outage last year. We're looking into it.
>
> I have it here still. Any other way I can send it to you?

Never mind, I found the machine it was stored on. It was a
misconfiguration caused by the post-lab-outage rebuilds.

> > Extremely unlikely.
>
> Okay, taking your word for it. But something seems to be stalling
> journal trimming. We had a similar thing yesterday evening, but at a much
> smaller scale and without a noticeable pool size increase. I only got an
> alert that the ceph_mds_log_ev Prometheus metric started going up again
> for a single MDS. It grew past 1M events, so I restarted it. I also
> restarted the other MDSs and they all immediately jumped to above 5M
> events and stayed there. They are, in fact, still there and have
> decreased only very slightly this morning. The pool size is well within
> a normal range, though, at 290 GiB.

Please keep monitoring it. I think you're not the only cluster to
experience this.

> > So clearly (a) an incredible number of journal events are being logged
> > and (b) trimming is slow or unable to make progress. I'm looking into
> > why, but you can help by running the attached script when the problem
> > is occurring so I can investigate. I'll need a tarball of the outputs.
>
> How do I send it to you if not via ceph-post-file?

It should be working again sometime next week. We're moving the
drop.ceph.com service to a standalone VM soonish.

> > Also, on the off chance this is related to the MDS balancer, please
> > disable it since you're using ephemeral pinning:
> >
> >     ceph config set mds mds_bal_interval 0
>
> Done.
>
> Thanks for your help!
> Janek
>
>
> --
>
> Bauhaus-Universität Weimar
> Bauhausstr. 9a, R308
> 99423 Weimar, Germany
>
> Phone: +49 3643 58 3577
> www.webis.de

--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
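
For anyone following this thread, a minimal sketch of the checks discussed
above, assuming a standard ceph-mgr Prometheus exporter and access to the
MDS admin socket; the MDS daemon name and the 1M-event threshold are
placeholders, not values from the cluster in question:

    # Confirm the MDS balancer is now disabled:
    ceph config get mds mds_bal_interval

    # Journal counters for one MDS via the admin socket (run on the host
    # where the daemon runs; replace <name> with the MDS daemon name):
    ceph daemon mds.<name> perf dump mds_log

    # Prometheus expression matching the kind of alert described above
    # (threshold is illustrative):
    ceph_mds_log_ev > 1e6

In the perf dump output, the mds_log section's "ev" and "seg" counters are
the per-daemon sources of the ceph_mds_log_ev and ceph_mds_log_seg metrics,
so they can be watched directly while waiting for trimming to catch up.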