Re: [PATCH bpf-next 4/4] selftests/bpf: Add a case to test global percpu data

Leon Hwang <leon.hwang@xxxxxxxxx> · Mon, 10 Feb 2025 17:52:04 +0800

On 8/2/25 03:48, Andrii Nakryiko wrote:
> On Fri, Feb 7, 2025 at 2:00 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>>
>>
>>
>> On 6/2/25 08:09, Andrii Nakryiko wrote:
>>> On Mon, Jan 27, 2025 at 8:22 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>>>>

[...]

>>>> +void test_global_percpu_data_init(void)
>>>> +{
>>>> +       struct test_global_percpu_data *skel = NULL;
>>>> +       u64 *percpu_data = NULL;
>>>
>>> there is that test_global_percpu_data__percpu type you are declaring
>>> in the BPF skeleton, right? We should try using it here.
>>>
>>
>> No. bpftool does not generate test_global_percpu_data__percpu. The
>> struct for global variables is embedded into skeleton struct.
>>
>> Should we generate type for global variables?
> 
> we already have custom skeleton-specific type for .data, .rodata,
> .bss, so we should provide one for .percpu as well, yes
> 

Yes, I've generated it. But it should not add '__aligned(8)' to it. Or
bpf_map__set_initial_value() will fails because the aligned size is
different from the actual spec's value size.

If the actual value size is not __aligned(8), how should we lookup
element from percpu_array map?

The doc[0] does not provide a good practice for this case.

[0] https://docs.kernel.org/bpf/map_array.html#bpf-map-type-percpu-array

>>
>>> And for that array access, we should make sure that it's __aligned(8),
>>> so indexing by CPU index works correctly.
>>>
>>
>> Ack.
>>
>>> Also, you define per-CPU variable as int, but here it is u64, what's
>>> up with that?
>>>
>>
>> Like __aligned(8), it's to make sure 8-bytes aligned. It's better to use
>> __aligned(8).
> 
> It's hacky, and it won't work correctly on big-endian architectures.
> But you shouldn't need that if we have a struct representing this
> .percpu memory image. Just make sure that struct has 8 byte alignment
> (from bpftool side during skeleton generation, probably).
> 
> [...]
> 
>>> at least one of BPF programs (don't remember which one, could be
>>> raw_tp) supports specifying CPU index to run on, it would be nice to
>>> loop over CPUs, triggering BPF program on each one and filling per-CPU
>>> variable with current CPU index. Then we can check that all per-CPU
>>> values have expected values.
>>>
>>
>> Do you mean prog_tests/perf_buffer.c::trigger_on_cpu()?
>>
> 
> No, look at `cpu` field of `struct bpf_test_run_opts`. We should have
> a selftest using it, so you can work backwards from that.
> 

By referencing raw_tp, which uses `opts.cpu`, if use it to test global
percpu data, it will fail to test on non-zero CPU, because
bpf_prog_test_run_skb() disallows setting `opts.cpu`.

Then, when `setaffinity` like perf_buffer.c::trigger_on_cpu(), it's OK
to run the adding selftests on all CPUs.

So, should I use `setaffinity` or change the bpf prog type from tc to
raw_tp to use `opts.cpu`?

Thanks,
Leon