Re: kernfs memcg accounting

On Tue, May 10, 2022 at 08:06:24PM -0700, Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
> My primary goal was to apply the memory pressure on memory cgroups with a lot
> of (dying) children cgroups. On a multi-cpu machine a memory cgroup structure
> is way larger than a page, so a cgroup which looks small can be really large
> if we calculate the amount of memory taken by all children memcg internals.
> 
> Applying this pressure to another cgroup (e.g. the one which contains systemd)
> doesn't help to reclaim any pages which are pinning the dying cgroups.

Just a note -- this is another use case: cgroups created from within the
subtree (e.g. by a container). I agree that the cgroup-manager/systemd case
is also valid (as dying memcgs may accumulate after a restart).

memcgs are special in that the state they retain has a real memory footprint.
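
To put rough numbers on the footprint argument, here is a back-of-envelope
sketch. It is userspace Python, not kernel code, and every size in it
(per-CPU bytes, fixed bytes, CPU count, number of dying memcgs) is a made-up
placeholder rather than a measured value:

```python
# Back-of-envelope model only; all sizes are hypothetical placeholders,
# not figures taken from any kernel build.
PAGE_SIZE = 4096


def dying_memcg_footprint(nr_cpus, percpu_bytes_per_cpu, base_bytes, nr_dying):
    """Bytes pinned by nr_dying memcgs, each holding fixed plus per-CPU state."""
    per_memcg = base_bytes + nr_cpus * percpu_bytes_per_cpu
    return per_memcg * nr_dying


total = dying_memcg_footprint(
    nr_cpus=64,                 # assumed machine size
    percpu_bytes_per_cpu=1024,  # assumed per-CPU state per memcg
    base_bytes=2048,            # assumed fixed-size state per memcg
    nr_dying=1000,              # assumed number of dying children
)
print(total // PAGE_SIZE, "pages")  # prints "16500 pages"
```

With these assumed inputs the parent looks empty but its dying children pin
thousands of pages, which is why the charge target matters.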

> For other controllers (maybe blkcg aside, idk) it shouldn't matter, because
> there is no such problem there.
> 
> For consistency reasons I'd suggest to charge all *large* allocations
> (e.g. percpu) to the parent cgroup. Small allocations can be ignored.

Strictly speaking, this would mean that any controller would have an
implicit dependency on the memory controller (such as the io controller
has).
In the extreme case even a controller-less hierarchy would have such a
requirement (for precise kernfs_node accounting).
Such a dependency is not enforceable on v1 (with various topologies of
different hierarchies).
Although I initially favored consistency with the memory controller too,
I think it's simpler to charge to the creator's memcg to achieve
consistency across v1 and v2 :-)
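
The difference between the two charging policies can be sketched with a toy
model. This is illustrative userspace Python, not kernel code: the `Memcg`
class, `charge_new_cgroup()`, and the "manager"/"workload" names are all
hypothetical, and the model assumes the parent cgroup has a memcg at all,
which is exactly what cannot be relied on with v1 topologies:

```python
class Memcg:
    """Toy memcg: a charge propagates to the memcg and all its ancestors."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.charged = 0

    def charge(self, nbytes):
        node = self
        while node is not None:
            node.charged += nbytes
            node = node.parent


def charge_new_cgroup(nbytes, parent_memcg, creator_memcg, policy):
    """Charge a new cgroup's internal allocations under the given policy."""
    target = parent_memcg if policy == "parent" else creator_memcg
    target.charge(nbytes)
    return target


root = Memcg("root")
manager = Memcg("manager", root)    # memcg of the task that creates the cgroup
workload = Memcg("workload", root)  # memcg of the new cgroup's parent

charge_new_cgroup(65536, workload, manager, policy="parent")
charge_new_cgroup(65536, workload, manager, policy="creator")
print(workload.charged, manager.charged, root.charged)  # 65536 65536 131072
```

The "creator" policy only needs the memcg of the current task, which exists
regardless of how the other hierarchies are laid out.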

Michal


