Re: [PATCH bpf-next 4/4] selftests/bpf: Add a case to test global percpu data

Leon Hwang <leon.hwang@xxxxxxxxx> · Wed, 12 Feb 2025 09:50:56 +0800

On 11/2/25 08:15, Andrii Nakryiko wrote:
> On Mon, Feb 10, 2025 at 1:52 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>>
>>
>>
>> On 8/2/25 03:48, Andrii Nakryiko wrote:
>>> On Fri, Feb 7, 2025 at 2:00 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:

[...]

>>
>> Yes, I've generated it. But it should not add '__aligned(8)' to it. Or
>> bpf_map__set_initial_value() will fails because the aligned size is
>> different from the actual spec's value size.
>>
>> If the actual value size is not __aligned(8), how should we lookup
>> element from percpu_array map?
> 
> for .percpu libbpf can ensure that map is created with correct value
> size that matches __aligned(8) size? It's an error-prone corner case
> to non-multiple-of-8 size anyways (for per-CPU data), so just prevent
> the issue altogether?...
> 

Ack.

I'll update value size of .percpu maps to match __aligned(8) size, and
add '__aligned(8)' to the generated .percpu types.

>>
>> The doc[0] does not provide a good practice for this case.
>>
>> [0] https://docs.kernel.org/bpf/map_array.html#bpf-map-type-percpu-array
>>
>>>>
>>>>> And for that array access, we should make sure that it's __aligned(8),
>>>>> so indexing by CPU index works correctly.
>>>>>
>>>>
>>>> Ack.
>>>>
>>>>> Also, you define per-CPU variable as int, but here it is u64, what's
>>>>> up with that?
>>>>>
>>>>
>>>> Like __aligned(8), it's to make sure 8-bytes aligned. It's better to use
>>>> __aligned(8).
>>>
>>> It's hacky, and it won't work correctly on big-endian architectures.
>>> But you shouldn't need that if we have a struct representing this
>>> .percpu memory image. Just make sure that struct has 8 byte alignment
>>> (from bpftool side during skeleton generation, probably).
>>>
>>> [...]
>>>
>>>>> at least one of BPF programs (don't remember which one, could be
>>>>> raw_tp) supports specifying CPU index to run on, it would be nice to
>>>>> loop over CPUs, triggering BPF program on each one and filling per-CPU
>>>>> variable with current CPU index. Then we can check that all per-CPU
>>>>> values have expected values.
>>>>>
>>>>
>>>> Do you mean prog_tests/perf_buffer.c::trigger_on_cpu()?
>>>>
>>>
>>> No, look at `cpu` field of `struct bpf_test_run_opts`. We should have
>>> a selftest using it, so you can work backwards from that.
>>>
>>
>> By referencing raw_tp, which uses `opts.cpu`, if use it to test global
>> percpu data, it will fail to test on non-zero CPU, because
>> bpf_prog_test_run_skb() disallows setting `opts.cpu`.
>>
>> Then, when `setaffinity` like perf_buffer.c::trigger_on_cpu(), it's OK
>> to run the adding selftests on all CPUs.
>>
>> So, should I use `setaffinity` or change the bpf prog type from tc to
>> raw_tp to use `opts.cpu`?
> 
> Is it a problem to use raw_tp (we do it a lot)? If not, I'd go with
> raw_tp and opts.cpu.
> 

There's no problem to use raw_tp. Let us use raw_tp and opts.cpu.

Thanks,
Leon