On Wed, Apr 20, 2022 at 03:24:49PM -0700, Yang Shi wrote:
> On Fri, Apr 15, 2022 at 5:28 PM Roman Gushchin <roman.gushchin@xxxxxxxxx> wrote:
> >
> > There are 50+ different shrinkers in the kernel, many with their own bells and
> > whistles. Under the memory pressure the kernel applies some pressure on each of
> > them in the order of which they were created/registered in the system. Some
> > of them can contain only few objects, some can be quite large. Some can be
> > effective at reclaiming memory, some not.
> >
> > The only existing debugging mechanism is a couple of tracepoints in
> > do_shrink_slab(): mm_shrink_slab_start and mm_shrink_slab_end. They aren't
> > covering everything though: shrinkers which report 0 objects will never show up,
> > there is no support for memcg-aware shrinkers. Shrinkers are identified by their
> > scan function, which is not always enough (e.g. hard to guess which super
> > block's shrinker it is having only "super_cache_scan"). They are a passive
> > mechanism: there is no way to call into counting and scanning of an individual
> > shrinker and profile it.
> >
> > To provide a better visibility and debug options for memory shrinkers
> > this patchset introduces a /sys/kernel/shrinker interface, to some extent
> > similar to /sys/kernel/slab.
> >
> > For each shrinker registered in the system a folder is created. The folder
> > contains "count" and "scan" files, which allow to trigger count_objects()
> > and scan_objects() callbacks. For memcg-aware and numa-aware shrinkers
> > count_memcg, scan_memcg, count_node, scan_node, count_memcg_node
> > and scan_memcg_node are additionally provided. They allow to get per-memcg
> > and/or per-node object count and shrink only a specific memcg/node.
> >
> > To make debugging more pleasant, the patchset also names all shrinkers,
> > so that sysfs entries can have more meaningful names.
> >
> > Usage examples:
>
> Thanks, Roman. A follow-up question, why do we have to implement this
> in kernel if we just count the objects? It seems userspace tools could
> achieve it too, for example, drgn :-). Actually I did write a drgn
> script for debugging a problem a few months ago, which iterates
> specific memcg's lru_list to count the objects by their state.

Good question! It's because not all shrinkers are lru_list-based and even
some lru_list-based are implementing a custom logic on top of it, e.g.
shadow nodes. So there is no simple way to get the count from a generic
shrinker.

Also I want to be able to reclaim individual shrinkers from userspace
(e.g. to profile how effective the shrinking is).

Thanks!
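
To illustrate the userspace side of the interface described in the quoted
cover letter, here is a rough sketch in Python. It is only an illustration
under assumptions, not the patchset's documented usage: it assumes that
reading a shrinker's "count" file returns a single integer and that writing
a number to "scan" requests scanning that many objects. The helper names
read_shrinker_counts() and request_scan() are hypothetical, and the real
file formats may differ.

```python
from pathlib import Path

# Path taken from the cover letter; the per-file formats below are assumptions.
DEFAULT_ROOT = "/sys/kernel/shrinker"


def read_shrinker_counts(root=DEFAULT_ROOT):
    """Return {shrinker_name: object_count} for every readable "count" file.

    Assumes each "count" file starts with one integer (an assumption, not
    something the cover letter specifies).
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts  # interface not present on this kernel
    for entry in sorted(base.iterdir()):
        count_file = entry / "count"
        if not count_file.is_file():
            continue
        try:
            counts[entry.name] = int(count_file.read_text().split()[0])
        except (OSError, ValueError, IndexError):
            continue  # unreadable or unexpected format: skip this shrinker
    return counts


def request_scan(name, nr_to_scan, root=DEFAULT_ROOT):
    """Ask one shrinker to scan nr_to_scan objects by writing to "scan".

    The write format is an assumption; on a real system this needs root.
    """
    (Path(root) / name / "scan").write_text(str(nr_to_scan))


if __name__ == "__main__":
    # Print the largest shrinkers first; prints nothing if the interface is absent.
    for name, count in sorted(read_shrinker_counts().items(),
                              key=lambda item: item[1], reverse=True):
        print(f"{count:>12}  {name}")
```

Something like this would cover both points above: counting works for any
shrinker because the kernel calls its own count_objects(), and profiling
could be done by comparing the count before and after request_scan().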