Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

On Fri, Jun 9, 2023 at 3:27 AM Janek Bevendorff
<janek.bevendorff@xxxxxxxxxxxxx> wrote:
>
> Hi Patrick,
>
> > I'm afraid your ceph-post-file logs were lost to the nether. AFAICT,
> > our ceph-post-file storage has been non-functional since the beginning
> > of the lab outage last year. We're looking into it.
>
> I have it here still. Any other way I can send it to you?

Never mind, I found the machine it was stored on. The loss was due to a
misconfiguration introduced by the post-lab-outage rebuilds.

> > Extremely unlikely.
>
> Okay, taking your word for it. But something seems to be stalling
> journal trimming. We had a similar thing yesterday evening, but at much
> smaller scale without noticeable pool size increase. I only got an alert
> that the ceph_mds_log_ev Prometheus metric started going up again for a
> single MDS. It grew past 1M events, so I restarted it. I also restarted
> the other MDS and they all immediately jumped to above 5M events and
> stayed there. They are, in fact, still there and have decreased only
> very slightly in the morning. The pool size is totally within a normal
> range, though, at 290GiB.

Please keep monitoring it. I think you're not the only cluster to
experience this.
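
If it helps with the monitoring, here is a minimal sketch (not an official
tool) that flags MDS daemons whose journal event count exceeds a threshold.
It assumes you have collected the output of "ceph daemon mds.<name> perf dump"
for each MDS and that the journal event count appears as the "ev" counter in
the "mds_log" section, which is what I would expect the ceph_mds_log_ev metric
to be derived from. The function name, the sample data, and the 1M default
(taken from the figure you mentioned) are illustrative only.

```python
import json


def mds_over_threshold(perf_dumps, threshold=1_000_000):
    """Return {mds_name: event_count} for daemons above the threshold.

    perf_dumps maps an MDS name to its parsed perf dump (a dict).
    Assumption: the journal event count lives at dump["mds_log"]["ev"].
    """
    flagged = {}
    for name, dump in perf_dumps.items():
        ev = dump.get("mds_log", {}).get("ev")
        # Skip daemons whose dump lacks the counter instead of failing.
        if ev is not None and ev > threshold:
            flagged[name] = ev
    return flagged


if __name__ == "__main__":
    # Hypothetical sample data; real input would come from the perf dumps.
    sample = json.loads(
        '{"mds.a": {"mds_log": {"ev": 5200000}},'
        ' "mds.b": {"mds_log": {"ev": 1200}}}'
    )
    for name, ev in mds_over_threshold(sample).items():
        print(f"{name}: {ev} journal events")
```

Running it periodically and recording when a daemon crosses the threshold
should make it easier to correlate the growth with restarts or snapshot
activity.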

> > So clearly (a) an incredible number of journal events are being logged
> > and (b) trimming is slow or unable to make progress. I'm looking into
> > why but you can help by running the attached script when the problem
> > is occurring so I can investigate. I'll need a tarball of the outputs.
>
> How do I send it to you if not via ceph-post-file?

It should work again sometime next week. We're moving the drop.ceph.com
service to a standalone VM soon.

> > Also, on the off chance this is related to the MDS balancer, please
> > disable it since you're using ephemeral pinning:
> >
> > ceph config set mds mds_bal_interval 0
>
> Done.
>
> Thanks for your help!
> Janek
>
>
> --
>
> Bauhaus-Universität Weimar
> Bauhausstr. 9a, R308
> 99423 Weimar, Germany
>
> Phone: +49 3643 58 3577
> www.webis.de
>


-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



