Hi Roman,

that seems to be working. I've kicked off a test collecting memleak info with a 10 min interval. Here is example output it is showing:

[16:03:07] Top 10 stacks with outstanding allocations:
    1256 bytes in 157 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        ip6_route_info_create+0x25d [kernel]
        ip6_route_add+0x1d [kernel]
        addrconf_prefix_route+0x10d [kernel]
        addrconf_add_linklocal+0x9a [kernel]
        addrconf_addr_gen+0xfe [kernel]
        addrconf_dev_config+0x9f [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    1256 bytes in 157 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        ip6_route_info_create+0x25d [kernel]
        ip6_route_add+0x1d [kernel]
        addrconf_add_mroute+0xb5 [kernel]
        addrconf_add_dev+0x4f [kernel]
        addrconf_dev_config+0x83 [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    1260 bytes in 315 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        __percpu_counter_init+0x23 [kernel]
        fprop_global_init+0x22 [kernel]
        wb_domain_init+0x6e [kernel]
        mem_cgroup_css_alloc+0x19c [kernel]
        cgroup_apply_control_enable+0x13e [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    1400 bytes in 175 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        addrconf_f6i_alloc+0x36 [kernel]
        ipv6_add_addr+0x120 [kernel]
        add_addr+0x73 [kernel]
        addrconf_notify+0x50f [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    2520 bytes in 315 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        percpu_ref_init+0x23 [kernel]
        cgroup_apply_control_enable+0x183 [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    2520 bytes in 315 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        percpu_ref_init+0x23 [kernel]
        cgroup_mkdir+0x106 [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    70560 bytes in 315 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0xa7 [kernel]
        cgroup_apply_control_enable+0x13e [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    184640 bytes in 5770 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        __kmem_cache_create+0x3dc [kernel]
        create_cache+0xd2 [kernel]
        memcg_create_kmem_cache+0x109 [kernel]
        memcg_kmem_cache_create_func+0x20 [kernel]
        process_one_work+0x1fd [kernel]
        worker_thread+0x34 [kernel]
        kthread+0x121 [kernel]
        ret_from_fork+0x35 [kernel]
    309960 bytes in 315 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0x75 [kernel]
        cgroup_apply_control_enable+0x13e [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    1015808 bytes in 248 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        xt_percpu_counter_alloc+0x68 [kernel]
        find_check_entry.isra.7+0x55 [kernel]
        translate_table+0x475 [kernel]
        do_ipt_set_ctl+0x11a [kernel]
        nf_setsockopt+0x4c [kernel]
        ip_setsockopt+0x8a [kernel]
        raw_setsockopt+0x34 [kernel]
        sock_common_setsockopt+0x1a [kernel]
        __sys_setsockopt+0x86 [kernel]
        __x64_sys_setsockopt+0x24 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]

thanks,

Dan McGinnes

IBM Cloud - Containers performance

Int Tel: 247359    Ext Tel: 01962 817359

Notes: Daniel McGinnes/UK/IBM
Email: MCGINNES@xxxxxxxxxx

IBM (UK) Ltd, Hursley Park,Winchester,Hampshire, SO21 2JN

From:   Roman Gushchin <guro@xxxxxx>
To:     Daniel McGinnes <MCGINNES@xxxxxxxxxx>
Cc:     "cgroups@xxxxxxxxxxxxxxx" <cgroups@xxxxxxxxxxxxxxx>, Nathaniel Rockwell <nrockwell@xxxxxxxxxx>
Date:   27/09/2018 15:29
Subject:        Re: PROBLEM: Memory leaking when running kubernetes cronjobs

On Wed, Sep 26, 2018 at 04:46:57PM +0100, Daniel McGinnes wrote:
> Hi Roman,
>
> I have original memleak.py working, let me know when you have the modified
> version ready. Thanks again for all the help with this!

Hi Daniel!

I think you only need to change the list of kernel tracepoints
(defined in bpf_source_kernel) to track only per-cpu allocations:

bpf_source_kernel = """

TRACEPOINT_PROBE(percpu, percpu_alloc_percpu) {
        gen_alloc_enter((struct pt_regs *)args, args->size);
        return gen_alloc_exit2((struct pt_regs *)args, (size_t)args->ptr);
}

TRACEPOINT_PROBE(percpu, percpu_free_percpu) {
        return gen_free_enter((struct pt_regs *)args, (void *)args->ptr);
}
"""

I've copy-pasted it from some of my hacky dev scripts, not sure if this one works,
apply on your own risk.

Thanks!

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
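Since the test collects one memleak report every 10 minutes, it may help to compare reports mechanically rather than by eye. What follows is only a rough sketch, not part of memleak.py: the helper names parse_snapshot and growth are made up here, the stacks in FIRST are two entries from the report above cut down to three frames, and the numbers in SECOND are invented purely to exercise the code. It assumes the report layout shown in this thread ("N bytes in M allocations from stack" followed by "[kernel]" frames).

```python
import re

# Header of one report entry, e.g. "70560 bytes in 315 allocations from stack".
ENTRY_RE = re.compile(r"^(\d+) bytes in (\d+) allocations from stack$")


def parse_snapshot(text):
    """Parse one memleak report into {stack: (bytes, allocations)}.

    A stack is a tuple of frames such as "pcpu_alloc+0x39f". Offsets are kept
    because two call sites in the same function (for example
    mem_cgroup_css_alloc+0x75 and +0xa7) are otherwise indistinguishable.
    """
    stacks = {}
    header = None
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        match = ENTRY_RE.match(line)
        if match:
            if header is not None:
                stacks[tuple(frames)] = header
            header = (int(match.group(1)), int(match.group(2)))
            frames = []
        elif header is not None and line.endswith("[kernel]"):
            # Drop the trailing "[kernel]" tag, keep "symbol+offset".
            frames.append(line.rsplit(" ", 1)[0])
    if header is not None:
        stacks[tuple(frames)] = header
    return stacks


def growth(before, after):
    """Return [(delta_bytes, stack)] for stacks that grew, largest first.

    Caveat: the report only lists the top 10 stacks, so a stack missing from
    `before` may simply have been below the cut-off; it is counted from zero.
    """
    grown = []
    for stack, (nbytes, _count) in after.items():
        delta = nbytes - before.get(stack, (0, 0))[0]
        if delta > 0:
            grown.append((delta, stack))
    return sorted(grown, key=lambda item: item[0], reverse=True)


# Two entries from the report above, truncated to three frames each.
FIRST = """
[16:03:07] Top 10 stacks with outstanding allocations:
    70560 bytes in 315 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0xa7 [kernel]
    1015808 bytes in 248 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        xt_percpu_counter_alloc+0x68 [kernel]
"""

# Invented follow-up report (timestamp and numbers are NOT real data).
SECOND = """
[16:13:07] Top 10 stacks with outstanding allocations:
    141120 bytes in 630 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0xa7 [kernel]
    1015808 bytes in 248 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        xt_percpu_counter_alloc+0x68 [kernel]
"""

if __name__ == "__main__":
    for delta, stack in growth(parse_snapshot(FIRST), parse_snapshot(SECOND)):
        print(delta, "more bytes outstanding via", " <- ".join(stack))
```

Stacks whose outstanding bytes keep growing from one interval to the next are the ones worth a closer look; stacks that stay flat are more likely long-lived but bounded allocations.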