Re: [RFC PATCH bpf-next 1/2] bpf: Introduce global percpu data

Leon Hwang <leon.hwang@xxxxxxxxx> · Fri, 17 Jan 2025 14:24:26 +0800

On 17/1/25 07:37, Andrii Nakryiko wrote:
> On Wed, Jan 15, 2025 at 11:22 PM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On 15/1/25 07:10, Andrii Nakryiko wrote:
>>> On Mon, Jan 13, 2025 at 7:25 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>>>>

[...]

>>>
>>> So I think the feature overall makes sense, but we need to think
>>> through at least libbpf's side of things some more. Unlike .data,
>>> per-cpu .data section is not mmapable, and so that has implication on
>>> BPF skeleton and we should make sure all that makes sense on BPF
>>> skeleton side. In that sense, per-cpu global data is more akin to
>>> struct_ops initialization image, which can be accessed by user before
>>> skeleton is loaded to initialize the image.
>>>
>>> There are a few things to consider. What's the BPF skeleton interface?
>>> Do we expose it as single struct and use that struct as initial image
>>> for each CPU (which means user won't be able to initialize different
>>> CPU data differently, at least not through BPF skeleton facilities)?
>>> Or do we expose this as an array of structs and let user set each CPU
>>> data independently?
>>>
>>> I feel like keeping it simple and having one image for all CPUs would
>>> cover most cases. And users can still access the underlying
>>> PERCPU_ARRAY map if they need more control.
>>
>> Agree. It is necessary to keep it simple.
>>
>>>
>>> But either way, we need tests for skeleton, making sure we NULL-out
>>> this per-cpu global data, but take it into account before the load.
>>>
>>> Also, this huge calloc for possible CPUs, I'd like to avoid it
>>> altogether for the (probably very common) zero-initialized case.
>>
>> Ack.
>>
>>>
>>> So in short, needs a bit of iteration to figure out all the
>>> interfacing issues, but makes sense overall. See some more low-level
>>> remarks below.
>>>
>>
>> It is challenging to figure out them. I'll do my best to achieve it.
>>
>>> pw-bot: cr
>>>
>>>
> 
> [...]
> 
>>>
>>>> @@ -516,6 +516,7 @@ struct bpf_struct_ops {
>>>>  };
>>>>
>>>>  #define DATA_SEC ".data"
>>>> +#define PERCPU_DATA_SEC ".data..percpu"
>>>
>>> I don't like this prefix, even if that's what we have in the kernel.
>>> Something like just ".percpu" or ".percpu_data" or ".data_percpu" is
>>> better, IMO.
>>
>> I tested ".percpu". It is OK to use it. But we have to update "bpftool
>> gen" too, which relies on these section names.
>>
>> Is it better to keep ".data" prefix, like ".data.percpu", ".data_percpu"?
>> Can keeping ".data" prefix reduce some works on bpftool, go-ebpf and
>> akin bpf loaders?
> 
> It's literally two lines of code in gen.c, and that should actually be
> a common array of known prefixes. Even if someone uses this new
> .percpu section with old bpftool nothing will break, they just won't
> have structure representing the initial per-CPU image. They will still
> have the generic map pointer in the skeleton. So I think this is
> acceptable.
> 
> I'd definitely go with a simple and less error-prone ".percpu" prefix.
> 

Being simple and less error-prone is indeed important. Let's proceed
with the ".percpu" prefix as suggested.

Thanks,
Leon