On 8/2/25 03:48, Andrii Nakryiko wrote: > On Fri, Feb 7, 2025 at 2:00 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: >> >> >> >> On 6/2/25 08:09, Andrii Nakryiko wrote: >>> On Mon, Jan 27, 2025 at 8:22 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: >>>> [...] >>>> +void test_global_percpu_data_init(void) >>>> +{ >>>> + struct test_global_percpu_data *skel = NULL; >>>> + u64 *percpu_data = NULL; >>> >>> there is that test_global_percpu_data__percpu type you are declaring >>> in the BPF skeleton, right? We should try using it here. >>> >> >> No. bpftool does not generate test_global_percpu_data__percpu. The >> struct for global variables is embedded into skeleton struct. >> >> Should we generate type for global variables? > > we already have custom skeleton-specific type for .data, .rodata, > .bss, so we should provide one for .percpu as well, yes > Yes, I've generated it. But it should not add '__aligned(8)' to it. Or bpf_map__set_initial_value() will fails because the aligned size is different from the actual spec's value size. If the actual value size is not __aligned(8), how should we lookup element from percpu_array map? The doc[0] does not provide a good practice for this case. [0] https://docs.kernel.org/bpf/map_array.html#bpf-map-type-percpu-array >> >>> And for that array access, we should make sure that it's __aligned(8), >>> so indexing by CPU index works correctly. >>> >> >> Ack. >> >>> Also, you define per-CPU variable as int, but here it is u64, what's >>> up with that? >>> >> >> Like __aligned(8), it's to make sure 8-bytes aligned. It's better to use >> __aligned(8). > > It's hacky, and it won't work correctly on big-endian architectures. > But you shouldn't need that if we have a struct representing this > .percpu memory image. Just make sure that struct has 8 byte alignment > (from bpftool side during skeleton generation, probably). > > [...] > >>> at least one of BPF programs (don't remember which one, could be >>> raw_tp) supports specifying CPU index to run on, it would be nice to >>> loop over CPUs, triggering BPF program on each one and filling per-CPU >>> variable with current CPU index. Then we can check that all per-CPU >>> values have expected values. >>> >> >> Do you mean prog_tests/perf_buffer.c::trigger_on_cpu()? >> > > No, look at `cpu` field of `struct bpf_test_run_opts`. We should have > a selftest using it, so you can work backwards from that. > By referencing raw_tp, which uses `opts.cpu`, if use it to test global percpu data, it will fail to test on non-zero CPU, because bpf_prog_test_run_skb() disallows setting `opts.cpu`. Then, when `setaffinity` like perf_buffer.c::trigger_on_cpu(), it's OK to run the adding selftests on all CPUs. So, should I use `setaffinity` or change the bpf prog type from tc to raw_tp to use `opts.cpu`? Thanks, Leon