Re: A path forward to cleaning up dying cgroups?

Kairui Song <ryncsn@xxxxxxxxx> · Thu, 6 Feb 2025 12:56:03 +0800

On Thu, Feb 6, 2025 at 2:16 AM Yosry Ahmed <yosry.ahmed@xxxxxxxxx> wrote:
>
> On Wed, Feb 05, 2025 at 01:08:42PM -0500, Johannes Weiner wrote:
> > On Wed, Feb 05, 2025 at 12:50:19PM -0500, Hamza Mahfooz wrote:
> > > Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx>
> > >
> > > On 2/5/25 12:48, Hamza Mahfooz wrote:
> > > > I was just curious as to what the status of the issue described in [1]
> > > > is. It appears that the last time someone took a stab at it was in [2].
> >
> > If memory serves, the sticking point was whether pages should indeed
> > be reparented on cgroup death, or whether they could be moved
> > arbitrarily to other cgroups that are still using them.
> >
> > It's a bit unfortunate, because the reparenting patches were tested
> > and reviewed, and the arbitrary recharging was just an idea that
> > ttbomk nobody seriously followed up on afterwards.
>
> There was an RFC series [1] for the recharging, but all memcg
> maintainers hated it :P
>
> https://lore.kernel.org/lkml/20230720070825.992023-1-yosryahmed@xxxxxxxxxx/

We have been suffering from dying cgroup issues for years too, and I
just saw this series. Would it be a good idea to combine it with
reparenting (if we go with the reparenting approach)? Using the objcg
API to charge the folios does help speed up the reparenting, but it
also adds some overhead and complexity. Just walking and reparenting
the folios seems like a more direct approach.
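
To make "walking and reparenting" concrete, here is a toy userspace
model in Python. It is only a sketch, not kernel code: the Cgroup and
Folio classes and reparent_folios() are hypothetical names made up for
illustration, and locking, LRU lists and stats are all ignored.

```python
class Folio:
    """Toy stand-in for a folio; only tracks which cgroup it is charged to."""

    def __init__(self, ident):
        self.ident = ident
        self.memcg = None


class Cgroup:
    """Toy stand-in for a memcg (hypothetical, not a kernel structure)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.folios = set()  # folios currently charged to this cgroup
        self.dying = False

    def charge(self, folio):
        self.folios.add(folio)
        folio.memcg = self


def reparent_folios(dying):
    """Walk every folio charged to a dying cgroup and move it to the parent.

    Returns the number of folios moved. The cost is linear in the number
    of folios, which is the price of the direct approach.
    """
    parent = dying.parent
    if parent is None:
        raise ValueError("the root cgroup has no parent to reparent to")
    moved = 0
    for folio in list(dying.folios):
        dying.folios.discard(folio)
        parent.charge(folio)
        moved += 1
    return moved
```

The walk touches every folio once. As far as I understand it, the objcg
indirection avoids this per-folio walk, which is where the speedup (and
the extra complexity) comes from.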

Another idea: per our observation, dying cgroups have few mapped
pages, since their processes have all exited. Most of the remaining
folios are just cache, and shared mapped pages are rare, especially
for containers. So would a deferred recharge on access be good enough?
Mapped folios may also eventually be unmapped and then get recharged.
At the very least, this makes accounting more accurate.
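
A rough sketch of the deferred recharge idea, again as a toy Python
model and not kernel code. Everything here is an assumption for
illustration: access_folio() stands for some hook on the access path,
and the "skip folios that are still mapped" policy is just one possible
choice.

```python
class Cgroup:
    """Toy memcg that only counts charged folios (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.charged = 0
        self.dying = False


class Folio:
    """Toy folio charged to one cgroup, optionally still mapped."""

    def __init__(self, memcg, mapped=False):
        self.memcg = memcg
        self.mapped = mapped
        memcg.charged += 1


def access_folio(folio, accessor):
    """Hypothetical access hook: recharge a folio away from a dying cgroup.

    The charge moves to the accessing cgroup only if the current owner is
    dying and the folio is no longer mapped. Returns True if it moved.
    """
    owner = folio.memcg
    if owner is accessor or not owner.dying or folio.mapped:
        return False
    owner.charged -= 1
    accessor.charged += 1
    folio.memcg = accessor
    return True


def unmap_folio(folio):
    """Drop the mapping so that a later access can recharge the folio."""
    folio.mapped = False
```

In this model a cache folio moves to the live cgroup on its first
access, while a mapped folio stays with the dying cgroup until it is
unmapped and then accessed again.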