The bpf memory accounting has some known problems in container
environments:

- The container memory usage is not consistent if there's a pinned bpf
  program

  After the container restarts, the leftover bpf programs are not
  accounted to the new generation, so the memory usage of the container
  is not consistent. This issue can be resolved by introducing
  selectable memcg, but we don't have an agreement on the solution yet.
  See also the discussions at https://lwn.net/Articles/905150/ .

- The leftover non-preallocated bpf map can't be limited

  The leftover bpf map will be reparented, and thus it will be limited
  by the parent rather than by the container itself. Furthermore, if
  the parent is destroyed, it will be limited by its parent's parent,
  and so on. This can also be resolved by introducing selectable memcg.

- The memory dynamically allocated in a bpf prog is charged to the root
  memcg only

  Nowadays a bpf prog can dynamically allocate memory, for example via
  bpf_obj_new(), but it only allocates from the global bpf_mem_alloc
  pool, so the memory is charged to the root memcg only. That needs to
  be addressed by a new proposal.

So let's give the container user an option to disable bpf memory
accounting.

The idea of "cgroup.memory=nobpf" originally came from Tejun [1].

[1]. https://lwn.net/ml/linux-mm/YxjOawzlgE458ezL@xxxxxxxxxxxxxxx/

Changes,
v1->v2:
- squash patches (Roman)
- commit log improvement in patch #2. (Johannes)

Yafang Shao (4):
  mm: memcontrol: add new kernel parameter cgroup.memory=nobpf
  bpf: use bpf_map_kvcalloc in bpf_local_storage
  bpf: allow to disable bpf map memory accounting
  bpf: allow to disable bpf prog memory accounting

 Documentation/admin-guide/kernel-parameters.txt |  1 +
 include/linux/bpf.h                             | 16 ++++++++++++++++
 include/linux/memcontrol.h                      | 11 +++++++++++
 kernel/bpf/bpf_local_storage.c                  |  4 ++--
 kernel/bpf/core.c                               | 13 +++++++------
 kernel/bpf/memalloc.c                           |  3 ++-
 kernel/bpf/syscall.c                            | 20 ++++++++++++++++++--
 mm/memcontrol.c                                 | 18 ++++++++++++++++++
 8 files changed, 75 insertions(+), 11 deletions(-)

-- 
1.8.3.1