On Sun, Feb 05, 2023 at 06:58:00AM +0000, Yafang Shao wrote:
> The bpf memory accounting has some known problems in container
> environments,
>
> - The container memory usage is not consistent if there's pinned bpf
>   program
>   After the container restart, the leftover bpf programs won't account
>   to the new generation, so the memory usage of the container is not
>   consistent. This issue can be resolved by introducing selectable
>   memcg, but we don't have an agreement on the solution yet. See also
>   the discussions at https://lwn.net/Articles/905150/ .
>
> - The leftover non-preallocated bpf map can't be limited
>   The leftover bpf map will be reparented, and thus it will be limited by
>   the parent, rather than the container itself. Furthermore, if the
>   parent is destroyed, it will be limited by its parent's parent, and so
>   on. It can also be resolved by introducing selectable memcg.
>
> - The memory dynamically allocated in bpf prog is charged into root memcg
>   only
>   Nowadays the bpf prog can dynamically allocate memory, for example via
>   bpf_obj_new(), but it only allocates from the global bpf_mem_alloc
>   pool, so it will charge into root memcg only. That needs to be
>   addressed by a new proposal.
>
> So let's give the user an option to disable bpf memory accounting.
>
> The idea of "cgroup.memory=nobpf" is originally by Tejun[1].
>
> [1]. https://lwn.net/ml/linux-mm/YxjOawzlgE458ezL@xxxxxxxxxxxxxxx/
>
> Yafang Shao (5):
>   mm: memcontrol: add new kernel parameter cgroup.memory=nobpf
>   bpf: use bpf_map_kvcalloc in bpf_local_storage
>   bpf: introduce bpf_memcg_flags()
>   bpf: allow to disable bpf map memory accounting
>   bpf: allow to disable bpf prog memory accounting

Hello Yafang!

Overall the patch looks good to me, please feel free to add
Acked-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>

I'd squash patch (3) into (4), but up to you.

Thanks!