Hi all,

I'm trying to crowdsource information on open source tools that customers can run directly to explain memory mappings, usage, pressure, etc. We encounter both internal and external users who are looking for this insight, and it often requires significant engineering time to collect enough data to draw any conclusions.

A recent example is an external customer that upgraded their userspace and started to run into memcg-constrained memory pressure that wasn't previously observed. After handing off a hacky script to run in the background, it was immediately obvious that the source of the direct reclaim was all of the MADV_FREE memory that was sitting around. Converting to MADV_DONTNEED solved their issue.

A month ago, a different external customer was concerned about increased memory access latency in their guest on some instances, although there were no issues observable on the host. After handing off a hacky script to run in the background, it was immediately obvious that memory fragmentation was resulting in a large disparity in the number of hugepages that were available on some instances.

Rather than hacky scripts that collect things like vmstat, memory.stat, buddyinfo, etc., at regular intervals, it would be preferable to hand off something more complete.

The idea is an open source tool that can be run in the background to collect metrics for the system, NUMA nodes, and memcg hierarchies, as well as potentially from subsystems in the kernel like delay accounting. IOW, I want to be able to say "install ${tool} and send over the log file."

Are there any open source tools that do a good job of this today that I can latch onto? If not, it sounds like I'll be writing one from scratch. Let me know if there's interest in this as well.

Thanks!
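
For illustration only, the interval-based collection described above (vmstat, memory.stat, buddyinfo) might look roughly like the sketch below. It is not the script that was handed to either customer. The function names, the JSON-lines log format, and the default paths are assumptions made for the example; in particular, the memory.stat path assumes cgroup v2 mounted at /sys/fs/cgroup, and any file that is missing is simply skipped.

```python
import json
import time
from pathlib import Path

# Assumed default sources (hypothetical choice for this sketch). The memcg
# path assumes cgroup v2 mounted at /sys/fs/cgroup; adjust for other setups.
DEFAULT_SOURCES = {
    "vmstat": "/proc/vmstat",
    "buddyinfo": "/proc/buddyinfo",
    "memory.stat": "/sys/fs/cgroup/memory.stat",
}


def parse_counters(text):
    """Parse 'name value' lines, the layout of /proc/vmstat and memory.stat."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def parse_buddyinfo(text):
    """Parse /proc/buddyinfo into {'node<N>/<zone>': [free blocks per order]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected shape after splitting: Node 0 zone Normal 12 7 3 ...
        if len(parts) > 4 and parts[0] == "Node" and parts[2] == "zone":
            key = f"node{parts[1]}/{parts[3]}"
            zones[key] = [int(n) for n in parts[4:]]
    return zones


def snapshot(sources=DEFAULT_SOURCES):
    """Read every available source once and return one timestamped record."""
    record = {"time": time.time()}
    for name, path in sources.items():
        try:
            text = Path(path).read_text()
        except OSError:
            continue  # file not present or not readable here; skip it
        if name == "buddyinfo":
            record[name] = parse_buddyinfo(text)
        else:
            record[name] = parse_counters(text)
    return record


def collect(log_path, interval=10.0, samples=6, sources=DEFAULT_SOURCES):
    """Append one JSON record per sample to log_path, sleeping in between."""
    with open(log_path, "a") as log:
        for i in range(samples):
            log.write(json.dumps(snapshot(sources)) + "\n")
            log.flush()
            if i + 1 < samples:
                time.sleep(interval)


if __name__ == "__main__":
    collect("memstats.jsonl")
```

Each line of the resulting log is one sample, so successive records can be diffed to see which counters moved during a period of memory pressure. In the buddyinfo lists, the later entries are the higher-order free blocks, which is where fragmentation across otherwise similar instances would show up.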