On Thu, Mar 10, 2022 at 3:41 PM Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
>
> On Thu, Mar 10, 2022 at 12:21:44PM -0800, Yosry Ahmed wrote:
> > Hi everyone,
> >
> > I was looking at the memcg_slabinfo.py drgn script that offers a
> > replacement to the deprecated memory.kmem.slabinfo. I had some
> > questions about how it collects the memcg slab stats:
>
> Hi Yosry!
>
> First, I have to admit that I haven't spent too much time optimizing
> this script for speed. So, there are almost certainly available opportunities
> to enhance it and patches are highly welcome.
>
> >
> > 1. Why does the script loop through all struct pages on the system?
> > Wouldn't it be more efficient to loop for every kmem_cache, for every
> > online kmem_cache_node, then loop through slabs_free, slabs_full, and
> > slabs_partial lists?
>
> It's somewhat tricky with SLUB because of per-cpu partial pages (I'm less
> familiar with SLAB). In theory, we have a single-linked list of such pages,
> but idk if we can reliably traverse it (given that it will be changed
> concurrently). We also will be way more dependent on SLUB internals.
> However it still might be a good optimization.
>
> >
> > This seems more consistent with how /proc/slabinfo works, and more
> > efficient. I tested this on SLAB using a crash script as I am unable
> > to run drgn on my current setup. I am not sure how correct this would
> > be for SLUB though.
>
> /proc/slabinfo has its own weaknesses, e.g. it shows systematically wrong
> numbers for slab utilization because of how it handles per-cpu partial pages
> (on SLUB).
>

Honestly I haven't looked into this for SLUB, but it seems like it is
a valid optimization for SLAB. At least it is equivalent to
/proc/slabinfo which I assume is somehow accurate for SLAB.

> >
> > 2. Before looping through pages, why does the script collect all
> > objcgs belonging to the desired memcg in a set, and then test every
> > objcg in a slab page to see whether it belongs to that memcg. Wouldn't
> > it be easier to just check objcg->memcg? AFAICT this gets updated as
> > well when the objcg is reparented.
>
> I can't think of any good reason now, however it's not obviously faster
> (I guess dereferencing of a pointer in drgn can be more expensive than doing
> few "local" comparison, something to measure).
> If it is faster, it will be a good enhancement.
>

You are right (at least for crash) it is more expensive to dereference
the obj_cgroup pointers!

> >
> > Sorry for my ignorance if any of the assumptions I made are incorrect.
> > I just wanted to get more understanding of the implementation
> > decisions taken while writing the script.
>
> You're welcome!
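
For reference, the two membership checks from question 2 can be sketched
in plain Python. This is only a toy model: the classes Memcg, ObjCgroup
and SlabPage are made-up stand-ins, not drgn objects, and the code is not
taken from memcg_slabinfo.py. It just shows that both checks count the
same objects, so the choice comes down to what each lookup costs in the
debugger, which would still need to be measured.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass(eq=False)  # eq=False keeps identity-based hashing, like comparing pointers
class Memcg:
    name: str


@dataclass(eq=False)
class ObjCgroup:
    memcg: Memcg  # stand-in for obj_cgroup->memcg


@dataclass
class SlabPage:
    # One slot per object; None means the object is not charged to any objcg.
    objcgs: List[Optional[ObjCgroup]] = field(default_factory=list)


def count_with_set(pages: Iterable[SlabPage], wanted: Set[ObjCgroup]) -> int:
    """Precollected set: each object needs only a 'local' membership test."""
    return sum(1 for page in pages for objcg in page.objcgs
               if objcg is not None and objcg in wanted)


def count_with_deref(pages: Iterable[SlabPage], memcg: Memcg) -> int:
    """Direct check: follow objcg.memcg for every object on every page."""
    return sum(1 for page in pages for objcg in page.objcgs
               if objcg is not None and objcg.memcg is memcg)


# Hypothetical data: two memcgs, three objcgs, two slab pages.
root = Memcg("root")
child = Memcg("child")
a, b, c = ObjCgroup(child), ObjCgroup(child), ObjCgroup(root)
pages = [SlabPage([a, None, c]), SlabPage([b, a, None])]

# The set is built once, before the page walk.
child_objcgs = {objcg for objcg in (a, b, c) if objcg.memcg is child}

set_count = count_with_set(pages, child_objcgs)
deref_count = count_with_deref(pages, child)
print(set_count, deref_count)
```

In this model the set-based version pays one dereference per objcg up
front, while the direct version pays one per charged object.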