On Mon, Feb 10, 2025 at 1:52 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>
>
>
> On 8/2/25 03:48, Andrii Nakryiko wrote:
> > On Fri, Feb 7, 2025 at 2:00 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 6/2/25 08:09, Andrii Nakryiko wrote:
> >>> On Mon, Jan 27, 2025 at 8:22 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> >>>>
>
> [...]
>
> >>>> +void test_global_percpu_data_init(void)
> >>>> +{
> >>>> +	struct test_global_percpu_data *skel = NULL;
> >>>> +	u64 *percpu_data = NULL;
> >>>
> >>> there is that test_global_percpu_data__percpu type you are declaring
> >>> in the BPF skeleton, right? We should try using it here.
> >>>
> >>
> >> No. bpftool does not generate test_global_percpu_data__percpu. The
> >> struct for global variables is embedded into the skeleton struct.
> >>
> >> Should we generate a type for global variables?
> >
> > we already have custom skeleton-specific types for .data, .rodata,
> > .bss, so we should provide one for .percpu as well, yes
> >
>
> Yes, I've generated it. But it should not add '__aligned(8)' to it. Or
> bpf_map__set_initial_value() will fail, because the aligned size is
> different from the actual spec's value size.
>
> If the actual value size is not __aligned(8), how should we look up
> an element from the percpu_array map?

for .percpu, libbpf can ensure that the map is created with a correct
value size that matches the __aligned(8) size? It's an error-prone
corner case to allow a non-multiple-of-8 size anyways (for per-CPU
data), so just prevent the issue altogether?...

>
> The doc[0] does not provide a good practice for this case.
>
> [0] https://docs.kernel.org/bpf/map_array.html#bpf-map-type-percpu-array
>
> >>
> >>> And for that array access, we should make sure that it's __aligned(8),
> >>> so indexing by CPU index works correctly.
> >>>
> >>
> >> Ack.
> >>
> >>> Also, you define the per-CPU variable as int, but here it is u64,
> >>> what's up with that?
> >>>
> >>
> >> Like __aligned(8), it's to make sure it's 8-bytes aligned. It's
> >> better to use __aligned(8).
> >
> > It's hacky, and it won't work correctly on big-endian architectures.
> > But you shouldn't need that if we have a struct representing this
> > .percpu memory image. Just make sure that struct has 8-byte alignment
> > (from the bpftool side during skeleton generation, probably).
> >
> > [...]
> >
> >>> at least one of the BPF program types (don't remember which one,
> >>> could be raw_tp) supports specifying the CPU index to run on, it
> >>> would be nice to loop over CPUs, triggering the BPF program on each
> >>> one and filling the per-CPU variable with the current CPU index.
> >>> Then we can check that all per-CPU values have the expected values.
> >>>
> >>
> >> Do you mean prog_tests/perf_buffer.c::trigger_on_cpu()?
> >>
> >
> > No, look at the `cpu` field of `struct bpf_test_run_opts`. We should
> > have a selftest using it, so you can work backwards from that.
> >
>
> Referencing the raw_tp selftest, which uses `opts.cpu`: if I use it to
> test global percpu data, it will fail on non-zero CPUs, because
> bpf_prog_test_run_skb() disallows setting `opts.cpu`.
>
> Then, with `setaffinity` like perf_buffer.c::trigger_on_cpu(), it's OK
> to run the added selftests on all CPUs.
>
> So, should I use `setaffinity`, or change the bpf prog type from tc to
> raw_tp in order to use `opts.cpu`?

Is it a problem to use raw_tp (we do it a lot)? If not, I'd go with
raw_tp and opts.cpu.

> Thanks,
> Leon
>