On 11/2/25 08:15, Andrii Nakryiko wrote: > On Mon, Feb 10, 2025 at 1:52 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: >> >> >> >> On 8/2/25 03:48, Andrii Nakryiko wrote: >>> On Fri, Feb 7, 2025 at 2:00 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: [...] >> >> Yes, I've generated it. But it should not add '__aligned(8)' to it. Or >> bpf_map__set_initial_value() will fails because the aligned size is >> different from the actual spec's value size. >> >> If the actual value size is not __aligned(8), how should we lookup >> element from percpu_array map? > > for .percpu libbpf can ensure that map is created with correct value > size that matches __aligned(8) size? It's an error-prone corner case > to non-multiple-of-8 size anyways (for per-CPU data), so just prevent > the issue altogether?... > Ack. I'll update value size of .percpu maps to match __aligned(8) size, and add '__aligned(8)' to the generated .percpu types. >> >> The doc[0] does not provide a good practice for this case. >> >> [0] https://docs.kernel.org/bpf/map_array.html#bpf-map-type-percpu-array >> >>>> >>>>> And for that array access, we should make sure that it's __aligned(8), >>>>> so indexing by CPU index works correctly. >>>>> >>>> >>>> Ack. >>>> >>>>> Also, you define per-CPU variable as int, but here it is u64, what's >>>>> up with that? >>>>> >>>> >>>> Like __aligned(8), it's to make sure 8-bytes aligned. It's better to use >>>> __aligned(8). >>> >>> It's hacky, and it won't work correctly on big-endian architectures. >>> But you shouldn't need that if we have a struct representing this >>> .percpu memory image. Just make sure that struct has 8 byte alignment >>> (from bpftool side during skeleton generation, probably). >>> >>> [...] >>> >>>>> at least one of BPF programs (don't remember which one, could be >>>>> raw_tp) supports specifying CPU index to run on, it would be nice to >>>>> loop over CPUs, triggering BPF program on each one and filling per-CPU >>>>> variable with current CPU index. Then we can check that all per-CPU >>>>> values have expected values. >>>>> >>>> >>>> Do you mean prog_tests/perf_buffer.c::trigger_on_cpu()? >>>> >>> >>> No, look at `cpu` field of `struct bpf_test_run_opts`. We should have >>> a selftest using it, so you can work backwards from that. >>> >> >> By referencing raw_tp, which uses `opts.cpu`, if use it to test global >> percpu data, it will fail to test on non-zero CPU, because >> bpf_prog_test_run_skb() disallows setting `opts.cpu`. >> >> Then, when `setaffinity` like perf_buffer.c::trigger_on_cpu(), it's OK >> to run the adding selftests on all CPUs. >> >> So, should I use `setaffinity` or change the bpf prog type from tc to >> raw_tp to use `opts.cpu`? > > Is it a problem to use raw_tp (we do it a lot)? If not, I'd go with > raw_tp and opts.cpu. > There's no problem to use raw_tp. Let us use raw_tp and opts.cpu. Thanks, Leon