Hi Leon,

On Mon, Jan 13, 2025 at 11:24:35PM +0800, Leon Hwang wrote:
> This patch set introduces global per-CPU data, similar to commit
> 6316f78306c1 ("Merge branch 'support-global-data'"), to reduce restrictions
> in C for BPF programs.
>
> With this enhancement, it becomes possible to define and use global per-CPU
> variables, much like the DEFINE_PER_CPU() macro in the kernel[0].
>
> The idea stems from the bpflbr project[1], which itself was inspired by
> retsnoop[2]. During testing of bpflbr on the v6.6 kernel, two LBR
> (Last Branch Record) entries were observed related to the
> bpf_get_smp_processor_id() helper.
>
> Since commit 1ae6921009e5 ("bpf: inline bpf_get_smp_processor_id() helper"),
> the bpf_get_smp_processor_id() helper has been inlined on x86_64, reducing
> the overhead and consequently minimizing these two LBR records.
>
> However, the introduction of global per-CPU data offers a more robust
> solution. By leveraging the percpu_array map and percpu instructions,
> global per-CPU data can be implemented intrinsically.
>
> This feature also facilitates sharing per-CPU information between tail
> callers and callees or between freplace callers and callees through a
> shared global per-CPU variable. Previously, this was achieved using a
> 1-entry percpu map, which this patch set aims to improve upon.

I think this would be great to have. bpftrace would've liked to use this
for its recent big string support, but instead had to simulate a percpu
global through regular globals.

Thanks,
Daniel