On Wed, Jan 15, 2025 at 11:22 PM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: > > Hi, > > On 15/1/25 07:10, Andrii Nakryiko wrote: > > On Mon, Jan 13, 2025 at 7:25 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: > >> > >> This patch introduces global per-CPU data, inspired by commit > >> 6316f78306c1 ("Merge branch 'support-global-data'"). It enables the > >> definition of global per-CPU variables in BPF, similar to the > >> DEFINE_PER_CPU() macro in the kernel[0]. > >> > >> For example, in BPF, it is able to define a global per-CPU variable like > >> this: > >> > >> int percpu_data SEC(".data..percpu"); > >> > >> With this patch, tools like retsnoop[1] and bpflbr[2] can simplify their > >> BPF code for handling LBRs. The code can be updated from > >> > >> static struct perf_branch_entry lbrs[1][MAX_LBR_ENTRIES] SEC(".data.lbrs"); > >> > >> to > >> > >> static struct perf_branch_entry lbrs[MAX_LBR_ENTRIES] SEC(".data..percpu.lbrs"); > >> > >> This eliminates the need to retrieve the CPU ID using the > >> bpf_get_smp_processor_id() helper. > >> > >> Additionally, by reusing global per-CPU variables, sharing information > >> between tail callers and callees or freplace callers and callees becomes > >> simpler compared to using percpu_array maps. > >> > >> Links: > >> [0] https://github.com/torvalds/linux/blob/fbfd64d25c7af3b8695201ebc85efe90be28c5a3/include/linux/percpu-defs.h#L114 > >> [1] https://github.com/anakryiko/retsnoop > >> [2] https://github.com/Asphaltt/bpflbr > >> > >> Signed-off-by: Leon Hwang <leon.hwang@xxxxxxxxx> > >> --- > >> kernel/bpf/arraymap.c | 39 +++++++++++++- > >> kernel/bpf/verifier.c | 45 +++++++++++++++++ > >> tools/lib/bpf/libbpf.c | 112 ++++++++++++++++++++++++++++++++--------- > >> 3 files changed, 171 insertions(+), 25 deletions(-) > >> > > > > So I think the feature overall makes sense, but we need to think > > through at least libbpf's side of things some more. Unlike .data, > > per-cpu .data section is not mmapable, and so that has implication on > > BPF skeleton and we should make sure all that makes sense on BPF > > skeleton side. In that sense, per-cpu global data is more akin to > > struct_ops initialization image, which can be accessed by user before > > skeleton is loaded to initialize the image. > > > > There are a few things to consider. What's the BPF skeleton interface? > > Do we expose it as single struct and use that struct as initial image > > for each CPU (which means user won't be able to initialize different > > CPU data differently, at least not through BPF skeleton facilities)? > > Or do we expose this as an array of structs and let user set each CPU > > data independently? > > > > I feel like keeping it simple and having one image for all CPUs would > > cover most cases. And users can still access the underlying > > PERCPU_ARRAY map if they need more control. > > Agree. It is necessary to keep it simple. > > > > > But either way, we need tests for skeleton, making sure we NULL-out > > this per-cpu global data, but take it into account before the load. > > > > Also, this huge calloc for possible CPUs, I'd like to avoid it > > altogether for the (probably very common) zero-initialized case. > > Ack. > > > > > So in short, needs a bit of iteration to figure out all the > > interfacing issues, but makes sense overall. See some more low-level > > remarks below. > > > > It is challenging to figure out them. I'll do my best to achieve it. > > > pw-bot: cr > > > > [...] > > > >> @@ -516,6 +516,7 @@ struct bpf_struct_ops { > >> }; > >> > >> #define DATA_SEC ".data" > >> +#define PERCPU_DATA_SEC ".data..percpu" > > > > I don't like this prefix, even if that's what we have in the kernel. > > Something like just ".percpu" or ".percpu_data" or ".data_percpu" is > > better, IMO. > > I tested ".percpu". It is OK to use it. But we have to update "bpftool > gen" too, which relies on these section names. > > Is it better to keep ".data" prefix, like ".data.percpu", ".data_percpu"? > Can keeping ".data" prefix reduce some works on bpftool, go-ebpf and > akin bpf loaders? It's literally two lines of code in gen.c, and that should actually be a common array of known prefixes. Even if someone uses this new .percpu section with old bpftool nothing will break, they just won't have structure representing the initial per-CPU image. They will still have the generic map pointer in the skeleton. So I think this is acceptable. I'd definitely go with a simple and less error-prone ".percpu" prefix. [...]