Re: Help needed, ceph fs down due to large stray dir

mds_beacon_grace is, perhaps confusingly, not an MDS setting; it is applied
on the MONs. Since you injected it into the MDS, that is likely why the
heartbeat is still failing:

This has the effect of having the MDS continue to send beacons to the
monitors even when its internal "heartbeat" mechanism has not been reset
(beat) in one hour. Note the previous mechanism for achieving this was via
the `mds_beacon_grace` monitor setting.
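
For reference, a minimal sketch of setting the option on the monitor side
instead of injecting it into the MDS. The `mon` target follows the point
above; the helper name `set_mon_beacon_grace`, the `RUN` switch and the value
3600 are illustrative assumptions, not something taken from this thread. By
default the helper only prints the command so it can be reviewed first:

```shell
# Hypothetical helper: build the "ceph config set" call for the monitors.
# Dry run by default; set RUN=1 to actually execute it on an admin node.
set_mon_beacon_grace() {
    grace="${1:-3600}"   # grace period in seconds (3600 is an assumed example)
    set -- ceph config set mon mds_beacon_grace "$grace"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"             # apply the setting via the ceph CLI
    else
        echo "$*"        # dry run: only print the command
    fi
}

set_mon_beacon_grace 3600
```

With RUN=1 the command is applied, and `ceph config get mon mds_beacon_grace`
should then report the new value. Whether one hour is a sensible grace period
for your cluster is a judgment call.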

On Fri, Jan 10, 2025 at 1:30 PM Frank Schilder <frans@xxxxxx> wrote:

> Hi all,
>
> we seem to have a serious issue with our file system, ceph version is
> pacific latest. After a large cleanup operation we had an MDS rank with
> 100Mio stray entries (yes, one hundred million). Today we restarted this
> daemon, which cleans up the stray entries. It seems that this leads to a
> restart loop due to OOM. The rank becomes active and then starts pulling in
> DNS and INOS entries until all memory is exhausted.
>
> I have no idea if there is at least progress removing the stray items or
> if it starts from scratch every time. If it needs to pull as many DNS/INOS
> into cache as there are stray items, we don't have a server at hand with
> enough RAM.
>
> Q1: Is the MDS at least making progress in every restart iteration?
> Q2: If not, how do we get this rank up again?
> Q3: If we can't get this rank up soon, can we at least move directories
> away from this rank by pinning it to another rank?
>
> Currently, the rank in question reports .mds_cache.num_strays=0 in perf
> dump.
>
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>