Hi Roman,

I have set the config option, rebuilt and installed. Where can I find the percpu allocator statistics? I've looked in /proc and can't see anything obvious.

thanks,

Dan McGinnes

IBM Cloud - Containers performance

Int Tel: 247359 Ext Tel: 01962 817359
Notes: Daniel McGinnes/UK/IBM
Email: MCGINNES@xxxxxxxxxx

IBM (UK) Ltd, Hursley Park,Winchester,Hampshire, SO21 2JN

From: Roman Gushchin <guro@xxxxxx>
To: Daniel McGinnes <MCGINNES@xxxxxxxxxx>
Cc: "cgroups@xxxxxxxxxxxxxxx" <cgroups@xxxxxxxxxxxxxxx>, Nathaniel Rockwell <nrockwell@xxxxxxxxxx>
Date: 01/10/2018 12:14
Subject: Re: PROBLEM: Memory leaking when running kubernetes cronjobs

Hi Daniel!

Can you, please, rebuild your kernel with the CONFIG_PERCPU_STATS config option, and post percpu allocator statistics after the test?

Sorry, I missed this option earlier.

Thanks!

From: Daniel McGinnes <MCGINNES@xxxxxxxxxx>
Sent: Monday, October 1, 2018 4:00:55 AM
To: Roman Gushchin
Cc: cgroups@xxxxxxxxxxxxxxx; Nathaniel Rockwell
Subject: Re: PROBLEM: Memory leaking when running kubernetes cronjobs

Hi Roman,

I let the test run over the weekend and Percpu memory has grown to 1.7GB.

Here is the latest output from the modified memleak tool (after I did drop_caches):

[09:56:31] Top 10 stacks with outstanding allocations:
    17232 bytes in 2154 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        ip6_route_info_create+0x25d [kernel]
        ip6_route_add+0x1d [kernel]
        addrconf_add_mroute+0xb5 [kernel]
        addrconf_add_dev+0x4f [kernel]
        addrconf_dev_config+0x83 [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        netdev_state_change+0x5a [kernel]
        linkwatch_do_dev+0x38 [kernel]
        __linkwatch_run_queue+0x10a [kernel]
        linkwatch_event+0x25 [kernel]
        process_one_work+0x1fd [kernel]
        worker_thread+0x34 [kernel]
        kthread+0x121 [kernel]
        ret_from_fork+0x35 [kernel]
    28536 bytes in 29 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0x75 [kernel]
        cgroup_apply_control_enable+0x13e [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdir+0x1b [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    59584 bytes in 266 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0xa7 [kernel]
        cgroup_apply_control_enable+0x13e [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    139232 bytes in 4351 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        __kmem_cache_create+0x3dc [kernel]
        create_cache+0xd2 [kernel]
        memcg_create_kmem_cache+0x109 [kernel]
        memcg_kmem_cache_create_func+0x20 [kernel]
        process_one_work+0x1fd [kernel]
        worker_thread+0x34 [kernel]
        kthread+0x121 [kernel]
        ret_from_fork+0x35 [kernel]
    261744 bytes in 266 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        mem_cgroup_css_alloc+0x75 [kernel]
        cgroup_apply_control_enable+0x13e [kernel]
        cgroup_mkdir+0x22b [kernel]
        kernfs_iop_mkdir+0x5f [kernel]
        vfs_mkdir+0x108 [kernel]
        do_mkdirat+0xe8 [kernel]
        __x64_sys_mkdirat+0x1a [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    279016 bytes in 34877 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        addrconf_f6i_alloc+0x36 [kernel]
        ipv6_add_addr+0x120 [kernel]
        addrconf_add_linklocal+0x73 [kernel]
        addrconf_addr_gen+0xfe [kernel]
        addrconf_dev_config+0x9f [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    279016 bytes in 34877 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        ip6_route_info_create+0x25d [kernel]
        ip6_route_add+0x1d [kernel]
        addrconf_add_mroute+0xb5 [kernel]
        addrconf_add_dev+0x4f [kernel]
        addrconf_dev_config+0x83 [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    279016 bytes in 34877 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        ip6_route_info_create+0x25d [kernel]
        ip6_route_add+0x1d [kernel]
        addrconf_prefix_route+0x10d [kernel]
        addrconf_add_linklocal+0x9a [kernel]
        addrconf_addr_gen+0xfe [kernel]
        addrconf_dev_config+0x9f [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    308248 bytes in 38531 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        addrconf_f6i_alloc+0x36 [kernel]
        ipv6_add_addr+0x120 [kernel]
        add_addr+0x73 [kernel]
        addrconf_notify+0x50f [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]
    1015808 bytes in 248 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        xt_percpu_counter_alloc+0x68 [kernel]
        find_check_entry.isra.7+0x55 [kernel]
        translate_table+0x475 [kernel]
        do_ipt_set_ctl+0x11a [kernel]
        nf_setsockopt+0x4c [kernel]
        ip_setsockopt+0x8a [kernel]
        raw_setsockopt+0x34 [kernel]
        sock_common_setsockopt+0x1a [kernel]
        __sys_setsockopt+0x86 [kernel]
        __x64_sys_setsockopt+0x24 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]

Although the stack with "1015808 bytes in 248 allocations from stack" is the largest, it does not seem to have changed significantly since the start of the test, where it showed:

    1019904 bytes in 249 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu+0x15 [kernel]
        xt_percpu_counter_alloc+0x68 [kernel]
        find_check_entry.isra.7+0x55 [kernel]
        translate_table+0x475 [kernel]
        do_ipt_set_ctl+0x11a [kernel]
        nf_setsockopt+0x4c [kernel]
        ip_setsockopt+0x8a [kernel]
        raw_setsockopt+0x34 [kernel]
        sock_common_setsockopt+0x1a [kernel]
        __sys_setsockopt+0x86 [kernel]
        __x64_sys_setsockopt+0x24 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]

The ones that look potentially interesting are those that look related to ipv6 - e.g.:

    279016 bytes in 34877 allocations from stack
        pcpu_alloc+0x39f [kernel]
        __alloc_percpu_gfp+0x12 [kernel]
        fib6_info_alloc+0x47 [kernel]
        addrconf_f6i_alloc+0x36 [kernel]
        ipv6_add_addr+0x120 [kernel]
        addrconf_add_linklocal+0x73 [kernel]
        addrconf_addr_gen+0xfe [kernel]
        addrconf_dev_config+0x9f [kernel]
        addrconf_notify+0x104 [kernel]
        notifier_call_chain+0x4c [kernel]
        raw_notifier_call_chain+0x16 [kernel]
        call_netdevice_notifiers_info+0x2d [kernel]
        __dev_notify_flags+0x65 [kernel]
        dev_change_flags+0x52 [kernel]
        do_setlink+0x2eb [kernel]
        rtnl_newlink+0x51e [kernel]
        rtnetlink_rcv_msg+0x27c [kernel]
        netlink_rcv_skb+0x54 [kernel]
        rtnetlink_rcv+0x15 [kernel]
        netlink_unicast+0x1ab [kernel]
        netlink_sendmsg+0x2d1 [kernel]
        sock_sendmsg+0x3e [kernel]
        __sys_sendto+0x13f [kernel]
        __x64_sys_sendto+0x28 [kernel]
        do_syscall_64+0x5a [kernel]
        entry_SYSCALL_64_after_hwframe+0x44 [kernel]

These show a steady climb in both the number of bytes and the number of allocations throughout the test.

The one thing I don't quite understand is that the amount of memory seems really small - only ~270KB - which doesn't seem to tie up with the fact that Percpu memory has increased to 1.7GB (from about 300MB at the start of the test).

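The size mismatch described above can be checked roughly by summing the per-stack totals in a memleak report. The following is a minimal sketch, not part of memleak itself: the names `sum_outstanding` and `scale_by_cpus` are hypothetical helpers. It assumes that the byte counts are the sizes passed to pcpu_alloc (the IPv6 stacks work out to 8 bytes per allocation, 279016 / 34877), in which case the real footprint would be that size multiplied by the number of possible CPUs. It ignores any chunk-level fragmentation in the percpu allocator, so it is at best a lower bound.

```python
import re

# Matches memleak header lines such as
# "279016 bytes in 34877 allocations from stack".
HEADER = re.compile(r"(\d+) bytes in (\d+) allocations from stack")


def sum_outstanding(report):
    """Return (total_bytes, total_allocations) summed over all stacks."""
    total_bytes = 0
    total_allocs = 0
    for size, count in HEADER.findall(report):
        total_bytes += int(size)
        total_allocs += int(count)
    return total_bytes, total_allocs


def scale_by_cpus(total_bytes, nr_possible_cpus):
    """Assumption: each percpu allocation is replicated once per possible CPU."""
    return total_bytes * nr_possible_cpus


if __name__ == "__main__":
    # Three header lines copied from the report above.
    sample = """
    279016 bytes in 34877 allocations from stack
    308248 bytes in 38531 allocations from stack
    1015808 bytes in 248 allocations from stack
    """
    total, allocs = sum_outstanding(sample)
    print("bytes:", total, "allocations:", allocs)
```

If the scaled total is still far below the Percpu figure, the difference is more likely to come from how the allocator holds on to partially used chunks than from any single stack in the report.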
Here is the full output from memleak in case it's of use ->

Dan McGinnes

From: Roman Gushchin <guro@xxxxxx>
To: Daniel McGinnes <MCGINNES@xxxxxxxxxx>
Cc: "cgroups@xxxxxxxxxxxxxxx" <cgroups@xxxxxxxxxxxxxxx>, Nathaniel Rockwell <nrockwell@xxxxxxxxxx>
Date: 27/09/2018 18:37
Subject: Re: PROBLEM: Memory leaking when running kubernetes cronjobs

On Thu, Sep 27, 2018 at 05:13:43PM +0100, Daniel McGinnes wrote:
> Hi Roman,
>
> that seems to be working. I've kicked off a test collecting memleak info
> with a 10 min interval.
>
> Here is example output it is showing:
>
> [16:03:07] Top 10 stacks with outstanding allocations:
>     1256 bytes in 157 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu_gfp+0x12 [kernel]
>         fib6_info_alloc+0x47 [kernel]
>         ip6_route_info_create+0x25d [kernel]
>         ip6_route_add+0x1d [kernel]
>         addrconf_prefix_route+0x10d [kernel]
>         addrconf_add_linklocal+0x9a [kernel]
>         addrconf_addr_gen+0xfe [kernel]
>         addrconf_dev_config+0x9f [kernel]
>         addrconf_notify+0x104 [kernel]
>         notifier_call_chain+0x4c [kernel]
>         raw_notifier_call_chain+0x16 [kernel]
>         call_netdevice_notifiers_info+0x2d [kernel]
>         __dev_notify_flags+0x65 [kernel]
>         dev_change_flags+0x52 [kernel]
>         do_setlink+0x2eb [kernel]
>         rtnl_newlink+0x51e [kernel]
>         rtnetlink_rcv_msg+0x27c [kernel]
>         netlink_rcv_skb+0x54 [kernel]
>         rtnetlink_rcv+0x15 [kernel]
>         netlink_unicast+0x1ab [kernel]
>         netlink_sendmsg+0x2d1 [kernel]
>         sock_sendmsg+0x3e [kernel]
>         __sys_sendto+0x13f [kernel]
>         __x64_sys_sendto+0x28 [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     1256 bytes in 157 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu_gfp+0x12 [kernel]
>         fib6_info_alloc+0x47 [kernel]
>         ip6_route_info_create+0x25d [kernel]
>         ip6_route_add+0x1d [kernel]
>         addrconf_add_mroute+0xb5 [kernel]
>         addrconf_add_dev+0x4f [kernel]
>         addrconf_dev_config+0x83 [kernel]
>         addrconf_notify+0x104 [kernel]
>         notifier_call_chain+0x4c [kernel]
>         raw_notifier_call_chain+0x16 [kernel]
>         call_netdevice_notifiers_info+0x2d [kernel]
>         __dev_notify_flags+0x65 [kernel]
>         dev_change_flags+0x52 [kernel]
>         do_setlink+0x2eb [kernel]
>         rtnl_newlink+0x51e [kernel]
>         rtnetlink_rcv_msg+0x27c [kernel]
>         netlink_rcv_skb+0x54 [kernel]
>         rtnetlink_rcv+0x15 [kernel]
>         netlink_unicast+0x1ab [kernel]
>         netlink_sendmsg+0x2d1 [kernel]
>         sock_sendmsg+0x3e [kernel]
>         __sys_sendto+0x13f [kernel]
>         __x64_sys_sendto+0x28 [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     1260 bytes in 315 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu_gfp+0x12 [kernel]
>         __percpu_counter_init+0x23 [kernel]
>         fprop_global_init+0x22 [kernel]
>         wb_domain_init+0x6e [kernel]
>         mem_cgroup_css_alloc+0x19c [kernel]
>         cgroup_apply_control_enable+0x13e [kernel]
>         cgroup_mkdir+0x22b [kernel]
>         kernfs_iop_mkdir+0x5f [kernel]
>         vfs_mkdir+0x108 [kernel]
>         do_mkdirat+0xe8 [kernel]
>         __x64_sys_mkdirat+0x1a [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     1400 bytes in 175 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu_gfp+0x12 [kernel]
>         fib6_info_alloc+0x47 [kernel]
>         addrconf_f6i_alloc+0x36 [kernel]
>         ipv6_add_addr+0x120 [kernel]
>         add_addr+0x73 [kernel]
>         addrconf_notify+0x50f [kernel]
>         notifier_call_chain+0x4c [kernel]
>         raw_notifier_call_chain+0x16 [kernel]
>         call_netdevice_notifiers_info+0x2d [kernel]
>         __dev_notify_flags+0x65 [kernel]
>         dev_change_flags+0x52 [kernel]
>         do_setlink+0x2eb [kernel]
>         rtnl_newlink+0x51e [kernel]
>         rtnetlink_rcv_msg+0x27c [kernel]
>         netlink_rcv_skb+0x54 [kernel]
>         rtnetlink_rcv+0x15 [kernel]
>         netlink_unicast+0x1ab [kernel]
>         netlink_sendmsg+0x2d1 [kernel]
>         sock_sendmsg+0x3e [kernel]
>         __sys_sendto+0x13f [kernel]
>         __x64_sys_sendto+0x28 [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     2520 bytes in 315 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu_gfp+0x12 [kernel]
>         percpu_ref_init+0x23 [kernel]
>         cgroup_apply_control_enable+0x183 [kernel]
>         cgroup_mkdir+0x22b [kernel]
>         kernfs_iop_mkdir+0x5f [kernel]
>         vfs_mkdir+0x108 [kernel]
>         do_mkdirat+0xe8 [kernel]
>         __x64_sys_mkdirat+0x1a [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     2520 bytes in 315 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu_gfp+0x12 [kernel]
>         percpu_ref_init+0x23 [kernel]
>         cgroup_mkdir+0x106 [kernel]
>         kernfs_iop_mkdir+0x5f [kernel]
>         vfs_mkdir+0x108 [kernel]
>         do_mkdirat+0xe8 [kernel]
>         __x64_sys_mkdirat+0x1a [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     70560 bytes in 315 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu+0x15 [kernel]
>         mem_cgroup_css_alloc+0xa7 [kernel]
>         cgroup_apply_control_enable+0x13e [kernel]
>         cgroup_mkdir+0x22b [kernel]
>         kernfs_iop_mkdir+0x5f [kernel]
>         vfs_mkdir+0x108 [kernel]
>         do_mkdirat+0xe8 [kernel]
>         __x64_sys_mkdirat+0x1a [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     184640 bytes in 5770 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu+0x15 [kernel]
>         __kmem_cache_create+0x3dc [kernel]
>         create_cache+0xd2 [kernel]
>         memcg_create_kmem_cache+0x109 [kernel]
>         memcg_kmem_cache_create_func+0x20 [kernel]
>         process_one_work+0x1fd [kernel]
>         worker_thread+0x34 [kernel]
>         kthread+0x121 [kernel]
>         ret_from_fork+0x35 [kernel]
>     309960 bytes in 315 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu+0x15 [kernel]
>         mem_cgroup_css_alloc+0x75 [kernel]
>         cgroup_apply_control_enable+0x13e [kernel]
>         cgroup_mkdir+0x22b [kernel]
>         kernfs_iop_mkdir+0x5f [kernel]
>         vfs_mkdir+0x108 [kernel]
>         do_mkdirat+0xe8 [kernel]
>         __x64_sys_mkdirat+0x1a [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]
>     1015808 bytes in 248 allocations from stack
>         pcpu_alloc+0x39f [kernel]
>         __alloc_percpu+0x15 [kernel]
>         xt_percpu_counter_alloc+0x68 [kernel]
>         find_check_entry.isra.7+0x55 [kernel]
>         translate_table+0x475 [kernel]
>         do_ipt_set_ctl+0x11a [kernel]
>         nf_setsockopt+0x4c [kernel]
>         ip_setsockopt+0x8a [kernel]
>         raw_setsockopt+0x34 [kernel]
>         sock_common_setsockopt+0x1a [kernel]
>         __sys_setsockopt+0x86 [kernel]
>         __x64_sys_setsockopt+0x24 [kernel]
>         do_syscall_64+0x5a [kernel]
>         entry_SYSCALL_64_after_hwframe+0x44 [kernel]

Perfect!

Given that cgroups are not "leaking", I'd bet on the last stack. I haven't seen anything similar in our case. It doesn't seem to be related to cgroups directly.

Also, it might be interesting to gather data for longer periods of time, including manual dropping of caches, to be sure that it's a real leak.

Thanks!

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
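The longer-running collection suggested above (drop caches manually, then record Percpu again) could be scripted along the following lines. This is a minimal sketch rather than the procedure actually used in this thread. The function names are hypothetical, and it assumes a kernel that reports a Percpu line in /proc/meminfo and root privileges for writing to /proc/sys/vm/drop_caches.

```python
import re


def parse_percpu_kb(meminfo_text):
    """Return the Percpu value in kB from /proc/meminfo text, or None."""
    match = re.search(r"^Percpu:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def drop_caches():
    """Ask the kernel to drop page cache, dentries and inodes (needs root)."""
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def sample_percpu(drop_first=True):
    """Optionally drop caches first, then read the current Percpu value in kB."""
    if drop_first:
        drop_caches()
    with open("/proc/meminfo") as f:
        return parse_percpu_kb(f.read())
```

Calling `sample_percpu()` at a fixed interval, for example the 10 minute interval used for memleak above, and logging the result would show whether Percpu keeps growing after caches are dropped. On a kernel built with CONFIG_PERCPU_STATS, the allocator statistics should, as far as I know, be exposed through debugfs as a `percpu_stats` file (typically /sys/kernel/debug/percpu_stats when debugfs is mounted there) rather than under /proc.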