On 2/1/22 05:49, Michal Hocko wrote:
On Mon 31-01-22 13:38:28, Waiman Long wrote:
[...]
Of course, it is also possible to have a debugfs interface to list that
dead memcg information, but displaying more information about the page
that pins the memcg will be hard without using the page owner tool.
Yes, you will need page owner or hook into the kernel by other means
(like already mentioned drgn). The question is whether scanning all
existing pages to get that information is the best we can offer.
The page_owner tool records the page information at allocation time.
There is some slight performance overhead, but it is the memory
overhead that is the major drawback of this approach as we need one
page_owner structure for each physical page. Page scanning is only done
when users read the page_owner debugfs file. Yes, I agree that scanning
all the pages is not the most efficient way to get this dead memcg
information, but it is what the page_owner tool does. I would argue that
this is the most efficient way, coding-wise, to get this information.
Keeping track of the list of dead memcgs may also have some runtime
overhead.
Could you be more specific? Offlined memcgs are still part of the
hierarchy IIRC. So it shouldn't be much more than iterating the whole
cgroup tree and collecting interesting data about dead cgroups.
What I mean is that without piggybacking on top of page_owner, we will
have to add a lot more code to collect and display that information,
which may have some overhead of its own.
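
For illustration only, the tree walk suggested above can be modelled in
user space roughly as follows. This is a toy sketch, not kernel or drgn
code: the Node class, its online and pinned_pages fields, and the
walk() and dead_nodes() helpers are all hypothetical names made up for
this example. It only shows that offline nodes which remain in the
hierarchy can be found with a plain traversal.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical stand-in for a memcg in the hierarchy (not a kernel type)."""
    name: str
    online: bool = True      # False models an offlined ("dead") memcg
    pinned_pages: int = 0    # assumed count of pages still charged to it
    children: List["Node"] = field(default_factory=list)


def walk(root: Node) -> Iterator[Node]:
    """Pre-order walk over the whole tree, offline nodes included."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(node.children))


def dead_nodes(root: Node) -> List[Node]:
    """Collect every node that is offline but still part of the tree."""
    return [n for n in walk(root) if not n.online]


# Made-up example hierarchy with two dead nodes.
root = Node("root", children=[
    Node("a", children=[Node("a/old", online=False, pinned_pages=3)]),
    Node("b", online=False, pinned_pages=1),
])

for n in dead_nodes(root):
    print(n.name, n.pinned_pages)
```

Note that such a walk only tells us which memcgs are dead and how much
they still hold. It says nothing about which pages pin them, which is
the part that page_owner provides.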
Cheers,
Longman