On Wed, Feb 02, 2022 at 05:38:07PM +0100, Michal Hocko wrote:
> On Wed 02-02-22 07:54:48, Roman Gushchin wrote:
> > On Wed, Feb 02, 2022 at 09:57:18AM +0100, Michal Hocko wrote:
> > > On Tue 01-02-22 11:41:19, Waiman Long wrote:
> > > >
> > > > On 2/1/22 05:49, Michal Hocko wrote:
> > > [...]
> > > > > Could you be more specific? Offlined memcgs are still part of the
> > > > > hierarchy IIRC. So it shouldn't be much more than iterating the whole
> > > > > cgroup tree and collecting interesting data about dead cgroups.
> > > >
> > > > What I mean is that without piggybacking on top of page_owner, we will
> > > > have to add a lot more code to collect and display that information,
> > > > which may have some overhead of its own.
> > >
> > > Yes, there is nothing like a free lunch. Page owner is certainly a tool
> > > that can be used. My main concern is that this tool doesn't really
> > > scale on large machines with a lot of memory. It will provide very
> > > detailed information but I am not sure this is particularly helpful to
> > > most admins (why should people process tons of allocation backtraces in
> > > the first place). Wouldn't it be sufficient to have per dead memcg stats
> > > to see where the memory sits?
> > >
> > > Accumulated offline memcgs is something that bothers more people and I
> > > am really wondering whether we can do more for those people to evaluate
> > > the current state.
> >
> > Cgroup v2 has had corresponding counters for years. Or do you mean
> > something different?
>
> Do we have anything more specific than nr_dying_descendants?

No, just nr_dying_descendants.

> I was thinking about an interface which would provide paths and stats for dead
> memcgs. But I have to confess I haven't really spent much time thinking
> about how much work that would be. I am by no means against adding memcg
> information to the page owner. I just think there must be a better way
> to present resource consumption by dead memcgs.

I'd go with a drgn script. I wrote a bunch of them some time ago and can
probably revive them and post them here (will take a few days).

I agree that the problem still exists and that providing some tooling
around it would be useful.

Thanks!
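
For reference, the one counter that exists today can already be collected
from userspace by walking the cgroup tree and reading cgroup.stat. The
following is only a rough sketch, not one of the drgn scripts mentioned
above: it assumes cgroup v2 is mounted at /sys/fs/cgroup, and the helper
names parse_cgroup_stat() and dying_descendants() are made up for
illustration.

```python
import os

CGROUP_ROOT = "/sys/fs/cgroup"  # assumption: default cgroup v2 mount point


def parse_cgroup_stat(text):
    """Parse the 'key value' lines of a cgroup.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def dying_descendants(root=CGROUP_ROOT):
    """Yield (cgroup path, nr_dying_descendants) for cgroups with a non-zero count."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if "cgroup.stat" not in filenames:
            continue
        try:
            with open(os.path.join(dirpath, "cgroup.stat")) as f:
                stats = parse_cgroup_stat(f.read())
        except OSError:
            continue  # the cgroup went away while we were walking the tree
        count = stats.get("nr_dying_descendants", 0)
        if count:
            yield dirpath, count


if __name__ == "__main__":
    # Print the cgroups with the most dying descendants first.
    for path, count in sorted(dying_descendants(), key=lambda item: -item[1]):
        print(f"{count:8d}  {path}")
```

This only reports how many dying cgroups sit below each live cgroup. The
dead memcgs themselves no longer have a directory, so their paths and
memory usage are not visible this way, which is the gap discussed above.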