On 7/6/24 10:55 PM, David Rientjes wrote:
> Hi all,
>
> I'm trying to crowdsource information on open source tools that can be
> used directly by customers to explain memory mappings, usage, pressure,
> etc.
>
> We encounter both internal and external users that are looking for this
> insight and it often requires significant engineering time to collect data
> to make any conclusions.
>
> A recent example is an external customer that recently upgraded their
> userspace and started to run into memcg constrained memory pressure that
> wasn't previously observed.  After handing off a hacky script to run in
> the background, it was immediately obvious that the source of the direct
> reclaim was all of the MADV_FREE memory that was sitting around.
> Converting to MADV_DONTNEED solved their issue.

BTW, was this reported/fixed upstream? That sounds like a bug to me, and
it would be better to fix it than to suggest the MADV_DONTNEED workaround
to everyone from now on.

> A month ago, a different external customer was concerned about increased
> memory access latency in their guest on some instances although there
> were no issues observable on the host.  After handing off a hacky script
> to run in the background, it was immediately obvious that memory
> fragmentation was resulting in a large disparity in the number of
> hugepages that were available on some instances.
>
> Rather than hacky scripts that collect things like vmstat, memory.stat,
> buddyinfo, etc, at regular intervals, it would be preferable to hand off
> something more complete.  Idea is an open source tool that can be run in
> the background to collect metrics for the system, NUMA nodes, and memcg
> hierarchies, as well as potentially from subsystems in the kernel like
> delay accounting.  IOW, I want to be able to say "install ${tool} and send
> over the log file."
>
> Are there any open source tools that do a good job of this today that I can
> latch onto?  If not, sounds like I'll be writing one from scratch.  Let me
> know if there's interest in this as well.
>
> Thanks!
>
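
For illustration only, the interval-based collection described in the
quoted mail (vmstat, memory.stat, buddyinfo) could look roughly like the
sketch below. This is not the script that was actually handed to the
customers, which is not shown in this thread. The function names, the
log file name "memstats.log", and the assumption of a cgroup v2
hierarchy mounted at /sys/fs/cgroup are all hypothetical. Missing or
unreadable files simply yield empty results.

```python
import json
import time
from pathlib import Path


def parse_key_value(text):
    """Parse 'name value' lines, the layout of /proc/vmstat and memory.stat."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def parse_buddyinfo(text):
    """Parse /proc/buddyinfo into {'nodeN/zone': [free blocks per order]}."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: Node 0, zone Normal 12 7 3 ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[f"node{node}/{fields[3]}"] = [int(n) for n in fields[4:]]
    return zones


def read_text(path):
    """Return the file contents, or an empty string if it cannot be read."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


def sample(memcg_dir="/sys/fs/cgroup"):
    """Take one snapshot; memcg_dir is an assumed cgroup v2 directory."""
    return {
        "time": time.time(),
        "vmstat": parse_key_value(read_text("/proc/vmstat")),
        "buddyinfo": parse_buddyinfo(read_text("/proc/buddyinfo")),
        "memcg": parse_key_value(read_text(Path(memcg_dir) / "memory.stat")),
    }


def run(interval_s=10, log_path="memstats.log", count=None):
    """Append one JSON line per snapshot; count=None loops until killed."""
    taken = 0
    with open(log_path, "a") as log:
        while count is None or taken < count:
            log.write(json.dumps(sample()) + "\n")
            log.flush()
            taken += 1
            if count is None or taken < count:
                time.sleep(interval_s)
```

Calling run() in the background would produce one JSON object per line,
so the log can be diffed between samples (for example, to watch the
high-order columns of buddyinfo shrink as memory fragments). A real tool
would presumably also need to walk memcg hierarchies and NUMA nodes,
which this sketch does not attempt.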