On Mon, Feb 10, 2025 at 1:52 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>
>
>
> On 8/2/25 03:48, Andrii Nakryiko wrote:
> > On Fri, Feb 7, 2025 at 2:00 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 6/2/25 08:09, Andrii Nakryiko wrote:
> >>> On Mon, Jan 27, 2025 at 8:22 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> >>>>
>
> [...]
>
> >>>> +void test_global_percpu_data_init(void)
> >>>> +{
> >>>> +	struct test_global_percpu_data *skel = NULL;
> >>>> +	u64 *percpu_data = NULL;
> >>>
> >>> there is that test_global_percpu_data__percpu type you are declaring
> >>> in the BPF skeleton, right? We should try using it here.
> >>>
> >>
> >> No. bpftool does not generate test_global_percpu_data__percpu. The
> >> struct for global variables is embedded into the skeleton struct.
> >>
> >> Should we generate a type for global variables?
> >
> > we already have custom skeleton-specific types for .data, .rodata,
> > .bss, so we should provide one for .percpu as well, yes
> >
>
> Yes, I've generated it. But it should not add '__aligned(8)' to it. Or
> bpf_map__set_initial_value() will fail, because the aligned size is
> different from the actual spec's value size.
>
> If the actual value size is not __aligned(8), how should we look up
> an element from the percpu_array map?

for .percpu, libbpf can ensure that the map is created with a correct
value size that matches the __aligned(8) size? It's an error-prone
corner case to allow a non-multiple-of-8 size anyways (for per-CPU
data), so just prevent the issue altogether?...

>
> The doc[0] does not provide a good practice for this case.
>
> [0] https://docs.kernel.org/bpf/map_array.html#bpf-map-type-percpu-array
>
> >>
> >>> And for that array access, we should make sure that it's __aligned(8),
> >>> so indexing by CPU index works correctly.
> >>>
> >>
> >> Ack.
> >>
> >>> Also, you define the per-CPU variable as int, but here it is u64,
> >>> what's up with that?
> >>>
> >>
> >> Like __aligned(8), it's to make sure it's 8-bytes aligned. It's
> >> better to use __aligned(8).
> >
> > It's hacky, and it won't work correctly on big-endian architectures.
> > But you shouldn't need that if we have a struct representing this
> > .percpu memory image. Just make sure that struct has 8-byte alignment
> > (from the bpftool side during skeleton generation, probably).
> >
> > [...]
> >
> >>> at least one of the BPF program types (don't remember which one,
> >>> could be raw_tp) supports specifying the CPU index to run on, it
> >>> would be nice to loop over CPUs, triggering the BPF program on each
> >>> one and filling the per-CPU variable with the current CPU index.
> >>> Then we can check that all per-CPU values have the expected values.
> >>>
> >>
> >> Do you mean prog_tests/perf_buffer.c::trigger_on_cpu()?
> >>
> >
> > No, look at the `cpu` field of `struct bpf_test_run_opts`. We should
> > have a selftest using it, so you can work backwards from that.
> >
>
> Referencing the raw_tp selftest, which uses `opts.cpu`: if I use it to
> test global percpu data, it will fail on non-zero CPUs, because
> bpf_prog_test_run_skb() disallows setting `opts.cpu`.
>
> Then, with `setaffinity` like perf_buffer.c::trigger_on_cpu(), it's OK
> to run the added selftests on all CPUs.
>
> So, should I use `setaffinity`, or change the bpf prog type from tc to
> raw_tp in order to use `opts.cpu`?

Is it a problem to use raw_tp (we do it a lot)? If not, I'd go with
raw_tp and opts.cpu.

> Thanks,
> Leon
>