On Thu, Mar 10, 2022 at 12:21:44PM -0800, Yosry Ahmed wrote:
> Hi everyone,
>
> I was looking at the memcg_slabinfo.py drgn script that offers a
> replacement to the deprecated memory.kmem.slabinfo. I had some
> questions about how it collects the memcg slab stats:

Hi Yosry!

First, I have to admit that I haven't spent too much time optimizing
this script for speed. So there are almost certainly opportunities to
enhance it, and patches are highly welcome.

>
> 1. Why does the script loop through all struct pages on the system?
> Wouldn't it be more efficient to loop for every kmem_cache, for every
> online kmem_cache_node, then loop through slabs_free, slabs_full, and
> slabs_partial lists?

It's somewhat tricky with SLUB because of per-cpu partial pages (I'm
less familiar with SLAB). In theory, we have a singly linked list of
such pages, but I don't know if we can reliably traverse it (given that
it will be changed concurrently). We would also be way more dependent on
SLUB internals. However, it still might be a good optimization.

>
> This seems more consistent with how /proc/slabinfo works, and more
> efficient. I tested this on SLAB using a crash script as I am unable
> to run drgn on my current setup. I am not sure how correct this would
> be for SLUB though.

/proc/slabinfo has its own weaknesses, e.g. it shows systematically
wrong numbers for slab utilization because of how it handles per-cpu
partial pages (on SLUB).

>
> 2. Before looping through pages, why does the script collect all
> objcgs belonging to the desired memcg in a set, and then test every
> objcg in a slab page to see whether it belongs to that memcg. Wouldn't
> it be easier to just check objcg->memcg? AFAICT this gets updated as
> well when the objcg is reparented.

I can't think of any good reason now. However, it's not obviously faster
(I guess dereferencing a pointer in drgn can be more expensive than
doing a few "local" comparisons, so it's something to measure).
If it is faster, it will be a good enhancement.

>
> Sorry for my ignorance if any of the assumptions I made are incorrect.
> I just wanted to get more understanding of the implementation
> decisions taken while writing the script.

You're welcome!
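
To make the two approaches from question 2 concrete, here is a toy model in plain Python. It is only a sketch and not drgn code: the classes Memcg, ObjCgroup and SlabPage and all helper names are hypothetical stand-ins for the kernel structures, and it assumes (as the question suggests) that reparenting both moves the objcgs to the parent's list and updates objcg->memcg. It says nothing about which variant is faster under drgn, which still has to be measured.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity-based hashing, so instances can be put in a set,
# loosely mirroring how pointers would be compared.
@dataclass(eq=False)
class Memcg:
    name: str
    objcg_list: List["ObjCgroup"] = field(default_factory=list)


@dataclass(eq=False)
class ObjCgroup:
    memcg: Memcg


@dataclass
class SlabPage:
    # One entry per object slot; None means the slot is not charged.
    obj_cgroups: List[Optional[ObjCgroup]]


def new_objcg(memcg: Memcg) -> ObjCgroup:
    """Create an objcg owned by memcg (hypothetical helper)."""
    objcg = ObjCgroup(memcg)
    memcg.objcg_list.append(objcg)
    return objcg


def reparent(child: Memcg, parent: Memcg) -> None:
    """Assumed reparenting: move the objcgs and update their memcg pointer."""
    for objcg in child.objcg_list:
        objcg.memcg = parent
    parent.objcg_list.extend(child.objcg_list)
    child.objcg_list.clear()


def count_with_set(pages: List[SlabPage], memcg: Memcg) -> int:
    """Collect the memcg's objcgs first, then do local set lookups."""
    wanted = set(memcg.objcg_list)
    return sum(1 for page in pages for o in page.obj_cgroups if o in wanted)


def count_with_pointer(pages: List[SlabPage], memcg: Memcg) -> int:
    """Follow objcg->memcg for every charged object instead."""
    return sum(
        1
        for page in pages
        for o in page.obj_cgroups
        if o is not None and o.memcg is memcg
    )


if __name__ == "__main__":
    parent, child = Memcg("parent"), Memcg("child")
    a, b = new_objcg(parent), new_objcg(child)
    pages = [SlabPage([a, b, None]), SlabPage([b, b, a])]

    reparent(child, parent)
    # Under the stated assumption, both strategies agree after reparenting.
    print(count_with_set(pages, parent), count_with_pointer(pages, parent))
```

In this model the two counts always match, so the choice between them comes down to cost: the set variant pays once up front and then compares locally, while the pointer variant pays one extra dereference per object.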