On Fri, Feb 16, 2024 at 8:57 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 2/12/24 22:38, Suren Baghdasaryan wrote:
> > Introduce CONFIG_MEM_ALLOC_PROFILING which provides definitions to easily
> > instrument memory allocators. It registers an "alloc_tags" codetag type
> > with /proc/allocinfo interface to output allocation tag information when
> > the feature is enabled.
> > CONFIG_MEM_ALLOC_PROFILING_DEBUG is provided for debugging the memory
> > allocation profiling instrumentation.
> > Memory allocation profiling can be enabled or disabled at runtime using
> > /proc/sys/vm/mem_profiling sysctl when CONFIG_MEM_ALLOC_PROFILING_DEBUG=n.
> > CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT enables memory allocation
> > profiling by default.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> > Co-developed-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> > Signed-off-by: Kent Overstreet <kent.overstreet@xxxxxxxxx>
> > ---
> >  Documentation/admin-guide/sysctl/vm.rst |  16 +++
> >  Documentation/filesystems/proc.rst      |  28 +++++
> >  include/asm-generic/codetag.lds.h       |  14 +++
> >  include/asm-generic/vmlinux.lds.h       |   3 +
> >  include/linux/alloc_tag.h               | 133 ++++++++++++++++++++
> >  include/linux/sched.h                   |  24 ++++
> >  lib/Kconfig.debug                       |  25 ++++
> >  lib/Makefile                            |   2 +
> >  lib/alloc_tag.c                         | 158 ++++++++++++++++++++++++
> >  scripts/module.lds.S                    |   7 ++
> >  10 files changed, 410 insertions(+)
> >  create mode 100644 include/asm-generic/codetag.lds.h
> >  create mode 100644 include/linux/alloc_tag.h
> >  create mode 100644 lib/alloc_tag.c
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index c59889de122b..a214719492ea 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -43,6 +43,7 @@ Currently, these files are in /proc/sys/vm:
> >  - legacy_va_layout
> >  - lowmem_reserve_ratio
> >  - max_map_count
> > +- mem_profiling         (only if CONFIG_MEM_ALLOC_PROFILING=y)
> >  - memory_failure_early_kill
> >  - memory_failure_recovery
> >  - min_free_kbytes
> > @@ -425,6 +426,21 @@ e.g., up to one or two maps per allocation.
> >  The default value is 65530.
> >
> >
> > +mem_profiling
> > +==============
> > +
> > +Enable memory profiling (when CONFIG_MEM_ALLOC_PROFILING=y)
> > +
> > +1: Enable memory profiling.
> > +
> > +0: Disabld memory profiling.
>
> Disable

Ack.

>
> ...
>
> > +allocinfo
> > +~~~~~~~
> > +
> > +Provides information about memory allocations at all locations in the code
> > +base. Each allocation in the code is identified by its source file, line
> > +number, module and the function calling the allocation. The number of bytes
> > +allocated at each location is reported.
>
> See, it even says "number of bytes" :)

Yes, we are changing the output to bytes.

>
> > +
> > +Example output.
> > +
> > +::
> > +
> > +    > cat /proc/allocinfo
> > +
> > +      153MiB     mm/slub.c:1826 module:slub func:alloc_slab_page
>
> Is "module" meant in the usual kernel module sense? In that case IIRC it is
> more common to annotate things e.g. [xfs] in case it's really a module, and
> nothing if it's built in, such as slub. Is that "slub" simply derived from
> "mm/slub.c"? Then it's just redundant?

Sounds good.
The new example would look like this:

> sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Note that [ctagmod] is the only allocation from a module in this example.

>
> > +     6.08MiB     mm/slab_common.c:950 module:slab_common func:_kmalloc_order
> > +     5.09MiB     mm/memcontrol.c:2814 module:memcontrol func:alloc_slab_obj_exts
> > +     4.54MiB     mm/page_alloc.c:5777 module:page_alloc func:alloc_pages_exact
> > +     1.32MiB     include/asm-generic/pgalloc.h:63 module:pgtable func:__pte_alloc_one
> > +     1.16MiB     fs/xfs/xfs_log_priv.h:700 module:xfs func:xlog_kvmalloc
> > +     1.00MiB     mm/swap_cgroup.c:48 module:swap_cgroup func:swap_cgroup_prepare
> > +      734KiB     fs/xfs/kmem.c:20 module:xfs func:kmem_alloc
> > +      640KiB     kernel/rcu/tree.c:3184 module:tree func:fill_page_cache_func
> > +      640KiB     drivers/char/virtio_console.c:452 module:virtio_console func:alloc_buf
> > +      ...
> > +
> > +
> >  meminfo
>
> ...
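
For anyone who wants to post-process the new format from userspace, here is a minimal Python sketch (illustration only, not part of the patch). It assumes each line is laid out as in the new example: a byte count, a second numeric column that appears to be a count of allocations, file:line, an optional [module] tag, and func:name. The names LINE_RE, AllocRecord and parse_allocinfo_line are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred from the example output in this thread:
#   <bytes> <count> <file>:<line> [<module>] func:<function>
# The [<module>] tag is only present for allocations made from a module.
LINE_RE = re.compile(
    r"^\s*(?P<bytes>\d+)\s+(?P<count>\d+)\s+"
    r"(?P<file>\S+):(?P<line>\d+)\s+"
    r"(?:\[(?P<module>[^\]]+)\]\s+)?"
    r"func:(?P<func>\S+)\s*$"
)


class AllocRecord(NamedTuple):
    nbytes: int            # first column: bytes allocated at this location
    count: int             # second column (meaning assumed: number of allocations)
    file: str
    line: int
    module: Optional[str]  # None for built-in code
    func: str


def parse_allocinfo_line(text: str) -> Optional[AllocRecord]:
    """Parse one line; return None for lines that do not match the assumed layout."""
    m = LINE_RE.match(text)
    if m is None:
        return None
    return AllocRecord(
        nbytes=int(m["bytes"]),
        count=int(m["count"]),
        file=m["file"],
        line=int(m["line"]),
        module=m["module"],
        func=m["func"],
    )
```

Because the first column is a plain byte count, a simple numeric sort (as with sort -rn above) orders the locations by size without any unit conversion.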
>
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 0be2d00c3696..78d258ca508f 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -972,6 +972,31 @@ config CODE_TAGGING
> >       bool
> >       select KALLSYMS
> >
> > +config MEM_ALLOC_PROFILING
> > +     bool "Enable memory allocation profiling"
> > +     default n
> > +     depends on PROC_FS
> > +     depends on !DEBUG_FORCE_WEAK_PER_CPU
> > +     select CODE_TAGGING
> > +     help
> > +       Track allocation source code and record total allocation size
> > +       initiated at that code location. The mechanism can be used to track
> > +       memory leaks with a low performance and memory impact.
> > +
> > +config MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> > +     bool "Enable memory allocation profiling by default"
> > +     default y
>
> I'd go with default n as that I'd select for a general distro.

Well, we have MEM_ALLOC_PROFILING=n by default, so if it was switched
on manually, that is a strong sign that the user wants it enabled,
IMO. So enabling this switch by default seems logical to me. If a
distro wants to have the feature compiled in but disabled by default,
that is perfectly doable; it just needs to set both options
appropriately. Does my logic make sense?

>
> > +     depends on MEM_ALLOC_PROFILING
> > +
> > +config MEM_ALLOC_PROFILING_DEBUG
> > +     bool "Memory allocation profiler debugging"
> > +     default n
> > +     depends on MEM_ALLOC_PROFILING
> > +     select MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> > +     help
> > +       Adds warnings with helpful error messages for memory allocation
> > +       profiling.
> > +
>
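
To make the interaction between the three options concrete, here is a small Python model of the Kconfig relationships quoted above. It is only a sketch of the dependency logic as written in this patch ("depends on" and "select"), not kernel code, and the function name profiling_on_by_default is hypothetical.

```python
def profiling_on_by_default(mem_alloc_profiling: bool,
                            enabled_by_default: bool = True,
                            debug: bool = False) -> bool:
    """Illustrative model: is profiling active by default for a given config?

    Parameters mirror MEM_ALLOC_PROFILING,
    MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT (default y in the patch) and
    MEM_ALLOC_PROFILING_DEBUG.
    """
    if not mem_alloc_profiling:
        # Both other options depend on MEM_ALLOC_PROFILING, so nothing is built.
        return False
    if debug:
        # MEM_ALLOC_PROFILING_DEBUG selects ..._ENABLED_BY_DEFAULT.
        enabled_by_default = True
    return enabled_by_default


# Feature compiled in but off by default (the distro case discussed above):
# MEM_ALLOC_PROFILING=y, MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n
distro_case = profiling_on_by_default(True, enabled_by_default=False)
```

In this model the distro case evaluates to False, while simply switching MEM_ALLOC_PROFILING on and leaving the other option at its proposed default evaluates to True.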