Re: [PATCH bpf-next 4/6] bpf, libbpf: add bpf_tail_call_static helper for bpf programs

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 24 Sep 2020 13:53:22 -0700

On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> Port of tail_call_static() helper function from Cilium's BPF code base [0]
> to libbpf, so others can easily consume it as well. We've been using this
> in production code for some time now. The main idea is that we guarantee
> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
> JITed BPF insns with direct jumps instead of having to fall back to using
> expensive retpolines. By using inline asm, we guarantee that the compiler
> won't merge the call from different paths with potentially different
> content of r2/r3.
>
> We're also using __throw_build_bug() macro in different places as a neat
> trick to trigger compilation errors when compiler does not remove code at
> compilation time. This works for the BPF backend as it does not implement
> the __builtin_trap().
>
>   [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>
> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> ---
>  tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
> index 1106777df00b..18b75a4c82e6 100644
> --- a/tools/lib/bpf/bpf_helpers.h
> +++ b/tools/lib/bpf/bpf_helpers.h
> @@ -53,6 +53,38 @@
>         })
>  #endif
>
> +/*
> + * Misc useful helper macros
> + */
> +#ifndef __throw_build_bug
> +# define __throw_build_bug()   __builtin_trap()
> +#endif

This will become part of libbpf's stable API. Do we want/need to expose
it? If we do, then we should probably provide a better
description.

I'm also curious: how is it better than _Static_assert() (see
test_cls_redirect.c), which also lets you provide a better error
message?
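
For reference, this is roughly the _Static_assert() usage I have in
mind. It's a minimal sketch in plain C11, not code from libbpf or the
selftests: struct demo_hdr, DEMO_SLOT_MAX, DEMO_CHECK_SLOT() and the
demo_*() functions are all made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header layout, only here to have something to check. */
struct demo_hdr {
	uint16_t sport;
	uint16_t dport;
	uint32_t seq;
};

/* Evaluated at compile time; the string ends up in the diagnostic. */
_Static_assert(sizeof(struct demo_hdr) == 8, "demo_hdr must stay 8 bytes");

/* Hypothetical upper bound for a tail call slot. */
enum { DEMO_SLOT_MAX = 32 };

/* Hypothetical wrapper: rejects out-of-range constant slots at build time. */
#define DEMO_CHECK_SLOT(slot) \
	_Static_assert((slot) < DEMO_SLOT_MAX, "tail call slot out of range")

static inline uint32_t demo_slot(void)
{
	/* A static assertion is a declaration, so it is valid in a block. */
	DEMO_CHECK_SLOT(7);
	return 7;
}

/* Run-time counterpart of the same check, for comparison. */
static inline void demo_runtime_check(void)
{
	assert(demo_slot() < DEMO_SLOT_MAX);
}
```

One caveat, as far as I understand it: _Static_assert() needs an
integer constant expression, so it can check a literal or an enum value
like above, but probably not a function argument such as slot, even
inside an __always_inline function. If that is the reason for the
__builtin_trap() approach, it would be worth saying so in the comment.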

> +
> +static __always_inline void
> +bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
> +{
> +       if (!__builtin_constant_p(slot))
> +               __throw_build_bug();
> +
> +       /*
> +        * Don't gamble, but _guarantee_ that LLVM won't optimize setting
> +        * r2 and r3 from different paths ending up at the same call insn as
> +        * otherwise we won't be able to use the jmpq/nopl retpoline-free
> +        * patching by the x86-64 JIT in the kernel.
> +        *

So the clobber list comment below is completely clear. But this one is
less clear without an example of a situation in which bad things
happen. Do you mind providing a pseudo-C example in which the
compiler optimizes things in such a way that the tail call
patching won't happen?
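
To check my understanding, I'd guess the problematic case looks roughly
like the sketch below. This is my assumption based on the commit
message, not something taken from the patch. It's plain C that runs on
the host: demo_tail_call() is a hypothetical stand-in for the real
helper, and demo_as_merged() is written by hand to show what the
compiler would be allowed to produce from demo_as_written().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the tail call helper: records the slot. */
static uint32_t demo_last_slot;

static void demo_tail_call(void *ctx, const void *map, uint32_t slot)
{
	(void)ctx;
	(void)map;
	demo_last_slot = slot;
}

/* As written: two call sites, each with its own constant slot. */
static void demo_as_written(void *ctx, const void *map, int cond)
{
	if (cond)
		demo_tail_call(ctx, map, 1);
	else
		demo_tail_call(ctx, map, 2);
}

/*
 * As the compiler may legally rewrite it: a single call site whose slot
 * is selected at run time, so no single constant is attached to the call.
 */
static void demo_as_merged(void *ctx, const void *map, int cond)
{
	uint32_t slot = cond ? 1 : 2;

	demo_tail_call(ctx, map, slot);
}

/* Both variants behave the same at run time; returns 1 if they agree. */
static int demo_same_result(int cond)
{
	uint32_t a, b;

	demo_as_written((void *)0, (void *)0, cond);
	a = demo_last_slot;
	demo_as_merged((void *)0, (void *)0, cond);
	b = demo_last_slot;
	return a == b;
}
```

The two functions are equivalent from the C point of view, which is why
the compiler is free to pick the second form. If this is the scenario
the comment is about, then a short version of it in the comment itself
would help a lot.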

> +        * Note on clobber list: we need to stay in-line with BPF calling
> +        * convention, so even if we don't end up using r0, r4, r5, we need
> +        * to mark them as clobber so that LLVM doesn't end up using them
> +        * before / after the call.
> +        */
> +       asm volatile("r1 = %[ctx]\n\t"
> +                    "r2 = %[map]\n\t"
> +                    "r3 = %[slot]\n\t"
> +                    "call 12\n\t"
> +                    :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
> +                    : "r0", "r1", "r2", "r3", "r4", "r5");
> +}
> +
>  /*
>   * Helper structure used by eBPF C program
>   * to describe BPF map attributes to libbpf loader
> --
> 2.21.0
>