Re: MDS cache always increasing

The MDS cannot release an inode while a client still holds capabilities
(caps) on it, because that client may have newer data than the OSDs. The
MDS needs to remember at least which client to ask if someone else
requests the same file.
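
For reference, a rough way to see which clients hold caps and how many:
query the session list on the active MDS (the MDS name below is a
placeholder; take it from "ceph fs status"):

  # on the host running the active MDS
  ceph daemon mds.<name> session ls | grep -E '"id"|num_caps'

  # or, on recent releases, through the monitors
  ceph tell mds.<rank-or-name> session ls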

The MDS does ask clients to release caps, but sometimes this doesn't work,
and there is no good troubleshooting guide beyond trying different kernel
versions and switching between the kernel client, ceph-fuse, and NFS.
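
If the clients do not respond to recall, the knobs involved are the MDS
cache size and the cap-recall settings, and on a kernel client dropping
its dentry/inode cache should nudge it into returning caps. A sketch only;
the values below are placeholders, not recommendations, and the defaults
differ between releases:

  # cap the MDS cache (bytes); the default is 4 GiB
  ceph config set mds mds_cache_memory_limit 8589934592

  # make the MDS recall caps more aggressively
  ceph config set mds mds_recall_max_caps 30000
  ceph config set mds mds_recall_max_decay_rate 1.5

  # on a kernel client (as root): drop cached dentries/inodes so the
  # corresponding caps can be released
  echo 2 > /proc/sys/vm/drop_caches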

On Mon, Sep 2, 2024 at 7:40 PM Sake Ceph <ceph@xxxxxxxxxxx> wrote:
>
> The folders contain a couple of million files, but they are really static. We have another folder with a lot of updates, and the MDS server for that folder does indeed show a continuous increase in memory usage. But I would focus on the app2 and app4 folders, because those see far fewer changes.
> But why does the MDS keep all this information in its memory? If it isn't accessed for more than 20 hours, it should release it in my opinion (even a lot earlier, like after an hour).
>
> Kind regards,
> Sake
>
> > On 02-09-2024 09:33 CEST, Eugen Block <eblock@xxxxxx> wrote:
> >
> >
> > Can you tell if the number of objects in your CephFS increases between
> > those bursts? I noticed something similar in a 16.2.15 cluster as
> > well. It's not that heavily used, but it contains home directories,
> > development working directories, etc. And when one user checked out a
> > git project, the MDS memory usage increased a lot, getting near its
> > configured limit. Before, there were around 3.7 million objects in the
> > CephFS; that user's checkout added more than a million files. It wasn't
> > a real issue (yet) because the usage isn't very dynamic and the total
> > number of files is relatively stable.
> > This doesn't really help resolve anything, but if your total number of
> > files grows, I'm not surprised that the MDS requires more memory.
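> >
> > (A quick way to watch this: the OBJECTS column of "ceph df" for the
> > CephFS data and metadata pools, plus "ceph daemon mds.<name> cache status"
> > or "dump_mempools" on the active MDS for the cache itself; the MDS name
> > here is a placeholder.)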
> >
> > Quoting Alexander Patrakov <patrakov@xxxxxxxxx>:
> >
> > > As a workaround, to reduce the impact of the MDS being slowed down by
> > > excessive memory consumption, I would suggest installing earlyoom,
> > > disabling swap, and configuring earlyoom as follows (usually through
> > > /etc/sysconfig/earlyoom, but it could be in a different place on your
> > > distribution):
> > >
> > > EARLYOOM_ARGS="-p -r 600 -m 4,4 -s 1,1"
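> > >
> > > (Roughly, per earlyoom's man page: -m 4,4 and -s 1,1 make it send
> > > SIGTERM, and then SIGKILL, to the largest process once available memory
> > > drops below 4% and free swap below 1%; -r 600 prints a memory report
> > > every 10 minutes; -p raises earlyoom's own priority so it can still act
> > > under memory pressure. Check earlyoom(1) on your distribution, as
> > > defaults and option semantics can differ between versions.)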
> > >
> > > On Sat, Aug 31, 2024 at 3:44 PM Sake Ceph <ceph@xxxxxxxxxxx> wrote:
> > >>
> > >> Oh, and it got worse after the upgrade to Reef (we were running Quincy).
> > >> With Quincy the memory usage was also often around 95%, with some swap
> > >> usage, but it never exhausted both to the point of crashing.
> > >>
> > >> Kind regards,
> > >> Sake
> > >> > On 31-08-2024 09:15 CEST, Alexander Patrakov <patrakov@xxxxxxxxx> wrote:
> > >> >
> > >> >
> > >> > Got it.
> > >> >
> > >> > However, to narrow down the issue, I suggest that you test whether it
> > >> > still exists after each of the following changes, tried separately:
> > >> >
> > >> > 1. Reduce max_mds to 1 (a command sketch follows below the list).
> > >> > 2. Leave max_mds at its current value, but migrate all clients from a
> > >> > direct CephFS mount to NFS.
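> > >> >
> > >> > (For option 1, something like "ceph fs set <fs_name> max_mds 1"
> > >> > should do it; <fs_name> is a placeholder for your filesystem's name.)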
> > >> >
> > >> > On Sat, Aug 31, 2024 at 2:55 PM Sake Ceph <ceph@xxxxxxxxxxx> wrote:
> > >> > >
> > >> > > I was talking about the hosts that the MDS containers are running
> > >> > > on. The clients are all RHEL 9.
> > >> > >
> > >> > > Kind regards,
> > >> > > Sake
> > >> > >
> > >> > > > On 31-08-2024 08:34 CEST, Alexander Patrakov <patrakov@xxxxxxxxx> wrote:
> > >> > > >
> > >> > > >
> > >> > > > Hello Sake,
> > >> > > >
> > >> > > > The combination of two active MDSs and RHEL8 does ring a bell, and I
> > >> > > > have seen this with Quincy, too. However, what's relevant is the
> > >> > > > kernel version on the clients. If they run the default 4.18.x kernel
> > >> > > > from RHEL8, please either upgrade to the mainline kernel or decrease
> > >> > > > max_mds to 1. If they run a modern kernel, then it is something I do
> > >> > > > not know about.
> > >> > > >
> > >> > > > On Sat, Aug 31, 2024 at 1:21 PM Sake Ceph <ceph@xxxxxxxxxxx> wrote:
> > >> > > > >
> > >> > > > > @Anthony: it's a small virtualized cluster and indeed swap
> > >> > > > > shouldn't be used, but this doesn't change the problem.
> > >> > > > >
> > >> > > > > @Alexander: the problem is on the active MDSs; the
> > >> > > > > standby-replay ones don't have issues anymore.
> > >> > > > >
> > >> > > > > Last night's backup run increased the memory usage to 86%
> > >> > > > > while rsync was running for app2. It dropped to 77.8% when it
> > >> > > > > was done. When the rsync for app4 was running it increased to
> > >> > > > > 84% and then dropped to 80%. After a few hours it has now
> > >> > > > > settled at 82%.
> > >> > > > > It looks to me like the MDS server is caching something forever
> > >> > > > > even though it isn't being used.
> > >> > > > >
> > >> > > > > The underlying host is running RHEL 8. An upgrade to RHEL 9 is
> > >> > > > > planned, but we hit some issues with automatically upgrading hosts.
> > >> > > > >
> > >> > > > > Kind regards,
> > >> > > > > Sake
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > Alexander Patrakov
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Alexander Patrakov
> > >
> > >
> > >
> > > --
> > > Alexander Patrakov
> >
> >



-- 
Alexander Patrakov
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



