Re: [PATCH bpf-next v2] selftests/bpf: use auto-dependencies for test objects

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Wed, 17 Jul 2024 09:41:29 -0700

On Tue, Jul 16, 2024 at 4:21 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Sun, Jul 14, 2024 at 6:17 PM Ihor Solodrai <ihor.solodrai@xxxxx> wrote:
> > On Friday, July 12th, 2024 at 12:06 PM, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > > So, bear with me for a moment please.
> > > We have 3 test_progs/smth.c files that depend on a few .bpf.o files at runtime,
> > > but do not include skel files for those .bpf.o, namely:
> > > - core_reloc.c
> > > - verifier_bitfield_write.c
> > > - pinning.c
> >
> > Is this an exhaustive list, or did you mean it just as an example?
>
> My bad, I looked only at the tests that failed on CI, there are indeed
> more dependencies.
>
> On Mon, 2024-07-15 at 10:44 -0700, Andrii Nakryiko wrote:
> > But I think the right direction is to get rid of file-based loading of
> > .bpf.o in favor of a) proper skeleton usage (lots of work to rewrite
> > portions of file, so not very hopeful here) or b) a quick-fix-like
> > equivalent and pretty straightforward <skel>___elf_bytes() replacement
> > everywhere where we currently load the same bytes from file on the
> > disk.
>
> I don't really see a point in migrating tests to use skels or
> elf_bytes if such migration does not simplify the test case itself.

Hm... "simplify tests" isn't the goal of this change. The goal is to
speed up the build process (while not breaking dependencies). So I
don't see simplification of any kind as a requirement. I'd say we
shouldn't complicate tests (too much) just for this, but some light
changes seem fine to me.

> By test simplification I mean at-least removal of some
> bpf_object__find_{map,program}_by_name() calls.

Some tests are generic and need (or at least are more natural)
lookup-by-name kind of APIs. Sure we can completely rewrite tests, but
why?

>
> I looked through the grep results and sorted those into buckets:
> - test changes don't simplify anything:

[...]

> Given the large number of test cases that don't seem to benefit from
> skel rework, I think we would need to handle direct dependencies
> somehow, e.g.:
> - by writing down these dependencies in the makefile, or

meh, if we can avoid this

> - by adding "fake" #include <smth.skel.h>, or

sure, "good enough"

> - by adding "true" #include <smth.skel.h> and using elf_bytes, or

better, because we actually have compile-time check that those
skeletons are used (and all necessary skeletons are used)

and with minimal code and logic changes

> - by adding a catch-all clause in the makefile, e.g. making test
>   runner depend on all .bpf.o files.

do we actually need to rebuild final binary if we are still just
loading .bpf.o from disk? We are not embedding such .bpf.o (embedding
is what skeleton headers are adding), so why rebuild .bpf.o?

Actually thinking about this again, I guess, if we don't want to add
skel.h to track precise dependencies, we don't really need to do
anything extra for those progs/*.c files that are not used through
skeletons. We just need to make sure that they are rebuilt if they are
changed. The rest will work as is because test runner binary will just
load them from disk at the next run (and user space part doesn't have
to be rebuilt, unless it itself changed).

>
> On Mon, 2024-07-15 at 10:44 -0700, Andrii Nakryiko wrote:
> > see above, elf_bytes is a quick and dirty way to get rid of file
> > dependencies and turn them into .skel.h dependency without having to
> > change existing tests significantly (which otherwise would be tons of
> > work).
>
> I assume that the goal here is to encode dependencies via skel.h files
> inclusion. For bpf selftests presence of skel.h guarantees presence of
> the freshly built object file. Why bother with elf_bytes rework if
> just including the skel files would be sufficient?

see above, just because there is no guarantee that we use all the
dependencies and we didn't miss any. It's not a high risk, but it's
also trivial to switch to elf_bytes.

another side benefit of completely switching to .skel.h is that we can
stop copying all .bpf.o files into BPF CI, because test_progs will be
self-contained (thought that's not 100% true due to btf__* and maybe a
few files more, which is sad and a bit different problem)

>
>   ---
>
> The catch-all clause in the current makefile looks as follows:
>
>     $(TRUNNER_BPF_OBJS): $(TRUNNER_OUTPUT)/%.bpf.o:                             \
>                      $(TRUNNER_BPF_PROGS_DIR)/%.c                       \
>                      $(TRUNNER_BPF_PROGS_DIR)/*.h                       \
>                          ...

keep in mind that we do want to rebuild .bpf.o if libbpf's BPF-side
headers changed, so let's make sure that stays (or happens, if we
don't do it already)

>
> This makes all .bpf.o files dependent on all BPF .c files.
> .bpf.o files rebuild is the main time consumer, at-least for me.
>
> I think that simply replacing this catch all by something like:
>
>     $(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_BPF_OBJS)
>
> on top of v2 of this patch-set is a good stop gap solution:
> it is simple, explicit and brings most of the speedup.
> People rebuilding test_progs would only pay for compilation of BPF
> object files that had been changed.
>
> ---
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 557078f2cf74..11316ccb5556 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -628,13 +628,16 @@ ifneq ($2:$(OUTPUT),:$(shell pwd))
>         $(Q)rsync -aq $$^ $(TRUNNER_OUTPUT)/
>  endif
>
> +# some X.test.o files have runtime dependencies on Y.bpf.o files
> +$(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_BPF_OBJS)
> +
>  $(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS)                      \
>                              $(TRUNNER_EXTRA_OBJS) $$(BPFOBJ)           \
>                              $(RESOLVE_BTFIDS)                          \
>                              $(TRUNNER_BPFTOOL)                         \
>                              | $(TRUNNER_BINARY)-extras
>         $$(call msg,BINARY,,$$@)
> -       $(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@
> +       $(Q)$$(CC) $$(CFLAGS) $$(filter-out %.bpf.o, $$(filter %.a %.o,$$^)) $$(LDLIBS) -o $$@
>         $(Q)$(RESOLVE_BTFIDS) --btf $(TRUNNER_OUTPUT)/btf_data.bpf.o $$@
>         $(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/$(USE_BOOTSTRAP)bpftool \
>                    $(OUTPUT)/$(if $2,$2/)bpftool