From: Hou Tao <houtao1@xxxxxxxxxx>

Hi,

The patchset aims to fix the problems found in the review of the per-cpu
kptr patch-set [0]. Patch #1 moves the acquisition of pcpu_lock after the
invocation of pcpu_chunk_addr_search(), which is a micro-optimization for
free_percpu(). It is included in this patchset because the same logic is
used in the newly-added API pcpu_alloc_size(). Patch #2 introduces
pcpu_alloc_size() for the dynamic per-cpu area. Patches #3 and #4 use
pcpu_alloc_size() to check whether unit_size matches the size of the
underlying per-cpu area and to select a matching bpf_mem_cache. Patch #5
moves the declaration of __bpf_obj_drop_impl() to bpf.h, and patch #6
fixes the freeing of per-cpu kptrs when they are freed by map
destruction. The last patch adds test cases for these problems.

Please see individual patches for details. Comments are always welcome.

Change Log:
v2:
 * add a new patch "don't acquire pcpu_lock for pcpu_chunk_addr_search()"
 * patch 2: change type of bit_off and end to unsigned long (Andrew)
 * patch 2: rename the new API as pcpu_alloc_size and follow 80-column
   convention (Dennis)
 * patch 5: move the common declaration into bpf.h (Stanislav, Alexei)

v1: https://lore.kernel.org/bpf/20231007135106.3031284-1-houtao@xxxxxxxxxxxxxxx/

[0]: https://lore.kernel.org/bpf/20230827152729.1995219-1-yonghong.song@xxxxxxxxx

Hou Tao (7):
  mm/percpu.c: don't acquire pcpu_lock for pcpu_chunk_addr_search()
  mm/percpu.c: introduce pcpu_alloc_size()
  bpf: Re-enable unit_size checking for global per-cpu allocator
  bpf: Use pcpu_alloc_size() in bpf_mem_free{_rcu}()
  bpf: Move the declaration of __bpf_obj_drop_impl() to bpf.h
  bpf: Use bpf_global_percpu_ma for per-cpu kptr in __bpf_obj_drop_impl()
  selftests/bpf: Add more test cases for bpf memory allocator

 include/linux/bpf.h                           |   1 +
 include/linux/bpf_mem_alloc.h                 |   1 +
 include/linux/percpu.h                        |   1 +
 kernel/bpf/helpers.c                          |  24 ++-
 kernel/bpf/memalloc.c                         |  38 ++--
 kernel/bpf/syscall.c                          |   6 +-
 mm/percpu.c                                   |  34 +++-
 .../selftests/bpf/prog_tests/test_bpf_ma.c    |  20 +-
 .../testing/selftests/bpf/progs/test_bpf_ma.c | 180 +++++++++++++++++-
 9 files changed, 269 insertions(+), 36 deletions(-)

--
2.29.2
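
As an illustration of why the free path needs to know the size of the
underlying allocation, here is a minimal user-space sketch of the idea of
selecting a size-class cache from the pointer alone. It is not the kernel
code: model_alloc(), model_alloc_size(), model_free(), size_to_cache_idx()
and the size-class table are all hypothetical names and values made up for
this example, with model_alloc_size() only standing in for the role that
pcpu_alloc_size() plays in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical size classes; the real allocator's table may differ. */
static const size_t size_classes[] = { 16, 32, 64, 96, 128, 256, 512 };
#define NR_CLASSES (sizeof(size_classes) / sizeof(size_classes[0]))

/* Header kept in front of every object so that the allocation size can
 * be recovered from the pointer alone (a stand-in for asking the
 * underlying allocator, which is what the series does). */
struct model_hdr {
	size_t alloc_size;
};

/* Return the index of the smallest class that fits size, or -1. */
int size_to_cache_idx(size_t size)
{
	for (size_t i = 0; i < NR_CLASSES; i++)
		if (size <= size_classes[i])
			return (int)i;
	return -1;
}

/* Allocate from the matching class and record the class size. */
void *model_alloc(size_t size)
{
	int idx = size_to_cache_idx(size);
	struct model_hdr *hdr;

	if (idx < 0)
		return NULL;
	hdr = malloc(sizeof(*hdr) + size_classes[idx]);
	if (!hdr)
		return NULL;
	hdr->alloc_size = size_classes[idx];
	return hdr + 1;
}

/* Recover the size of the underlying allocation from the pointer. */
size_t model_alloc_size(const void *ptr)
{
	const struct model_hdr *hdr = (const struct model_hdr *)ptr - 1;

	return hdr->alloc_size;
}

/* Free without a caller-supplied size: look the size up, check that it
 * matches a class exactly, and return the index of the cache the object
 * belongs to (-1 if the sizes do not match). */
int model_free(void *ptr)
{
	size_t size;
	int idx;

	assert(ptr != NULL);
	size = model_alloc_size(ptr);
	idx = size_to_cache_idx(size);
	if (idx < 0 || size_classes[idx] != size)
		return -1;
	free((struct model_hdr *)ptr - 1);
	return idx;
}
```

In this model a 24-byte request is served from the 32-byte class, and
model_free() finds that class again without being told the original
request size, which is the property the unit_size check relies on.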