This patch set introduces global per-CPU data, similar to commit
6316f78306c1 ("Merge branch 'support-global-data'"), to further relax the
restrictions on writing BPF programs in C.

With this enhancement, it becomes possible to define and use global per-CPU
variables in BPF programs, much like the DEFINE_PER_CPU() macro in the
kernel[0].

The idea stems from the bpflbr project[1], which itself was inspired by
retsnoop[2]. During testing of bpflbr on the v6.6 kernel, two LBR (Last
Branch Record) entries related to the bpf_get_smp_processor_id() helper
were observed.

Since commit 1ae6921009e5 ("bpf: inline bpf_get_smp_processor_id() helper"),
the bpf_get_smp_processor_id() helper has been inlined on x86_64, which
reduces its overhead and consequently minimizes these two LBR records.

However, the introduction of global per-CPU data offers a more robust
solution. By leveraging the percpu_array map and percpu instructions,
global per-CPU data can be implemented intrinsically.

This feature also facilitates sharing per-CPU information between tail
callers and callees, or between freplace callers and callees, through a
shared global per-CPU variable. Previously, this was achieved using a
1-entry percpu map, which this patch set aims to improve upon.

Links:
[0] https://github.com/torvalds/linux/blob/fbfd64d25c7af3b8695201ebc85efe90be28c5a3/include/linux/percpu-defs.h#L114
[1] https://github.com/Asphaltt/bpflbr
[2] https://github.com/anakryiko/retsnoop

Leon Hwang (2):
  bpf: Introduce global percpu data
  selftests/bpf: Add a case to test global percpu data

 kernel/bpf/arraymap.c                         |  39 +++++-
 kernel/bpf/verifier.c                         |  45 +++++++
 tools/lib/bpf/libbpf.c                        | 112 ++++++++++++++----
 .../bpf/prog_tests/global_data_init.c         |  83 ++++++++++++-
 .../bpf/progs/test_global_percpu_data.c       |  21 ++++
 5 files changed, 274 insertions(+), 26 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_global_percpu_data.c

--
2.47.1