On Mon, Sep 28, 2020 at 7:39 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> Port of tail_call_static() helper function from Cilium's BPF code base [0]
> to libbpf, so others can easily consume it as well. We've been using this
> in production code for some time now. The main idea is that we guarantee
> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
> JITed BPF insns with direct jumps instead of having to fall back to using
> expensive retpolines. By using inline asm, we guarantee that the compiler
> won't merge the call from different paths with potentially different
> content of r2/r3.
>
> We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable())
> in different places as a neat trick to trigger compilation errors when
> compiler does not remove code at compilation time. This works for the BPF
> back end as it does not implement the __builtin_trap().
>
>   [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>
> Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: Andrii Nakryiko <andriin@xxxxxx>
> ---

A few optional nits below, but looks good to me:

Acked-by: Andrii Nakryiko <andriin@xxxxxx>

>  tools/lib/bpf/bpf_helpers.h | 46 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
>

[...]

> +/*
> + * Helper function to perform a tail call with a constant/immediate map slot.
> + */
> +static __always_inline void
> +bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)

nit: const void *ctx would work here, right? It would avoid users having
to do unnecessary casts in some cases.

> +{
> +	if (!__builtin_constant_p(slot))
> +		__bpf_unreachable();
> +
> +	/*
> +	 * Provide a hard guarantee that LLVM won't optimize setting r2 (map
> +	 * pointer) and r3 (constant map index) from _different paths_ ending
> +	 * up at the _same_ call insn as otherwise we won't be able to use the
> +	 * jmpq/nopl retpoline-free patching by the x86-64 JIT in the kernel
> +	 * given they mismatch. See also d2e4c1e6c294 ("bpf: Constant map key
> +	 * tracking for prog array pokes") for details on verifier tracking.
> +	 *
> +	 * Note on clobber list: we need to stay in-line with BPF calling
> +	 * convention, so even if we don't end up using r0, r4, r5, we need
> +	 * to mark them as clobber so that LLVM doesn't end up using them
> +	 * before / after the call.
> +	 */
> +	asm volatile("r1 = %[ctx]\n\t"
> +		     "r2 = %[map]\n\t"
> +		     "r3 = %[slot]\n\t"
> +		     "call 12\n\t"

nit: it's weird to have tabs at the end of each string literal,
especially since r1 doesn't start with a tab...

> +		     :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
> +		     : "r0", "r1", "r2", "r3", "r4", "r5");
> +}
> +
>  /*
>   * Helper structure used by eBPF C program
>   * to describe BPF map attributes to libbpf loader
> --
> 2.21.0
>
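
To illustrate the const void *ctx nit above, here is a small host-side C sketch. It is not BPF code and not part of the patch; struct demo_ctx, read_mark_mutable(), read_mark_const() and the two caller functions are names made up for the example. It only shows that a const-qualified pointer parameter accepts both const and non-const callers without a cast, whereas a plain void * parameter forces callers holding a const pointer to cast the qualifier away.

```c
#include <stdint.h>

/* Hypothetical context type, standing in for a real BPF program context. */
struct demo_ctx {
	uint32_t mark;
};

/* Mirrors the prototype style in the patch: non-const ctx pointer. */
static inline uint32_t read_mark_mutable(void *ctx)
{
	return ((struct demo_ctx *)ctx)->mark;
}

/* Mirrors the suggested prototype style: const-qualified ctx pointer. */
static inline uint32_t read_mark_const(const void *ctx)
{
	return ((const struct demo_ctx *)ctx)->mark;
}

/* A caller that only holds a const pointer to its context. */
static uint32_t caller_with_const_ctx(const struct demo_ctx *ctx)
{
	/*
	 * read_mark_mutable(ctx) would discard the const qualifier, so the
	 * caller would have to write read_mark_mutable((void *)ctx).
	 */
	return read_mark_const(ctx);	/* no cast needed */
}

/* A caller holding a non-const pointer works with either variant. */
static uint32_t caller_with_mutable_ctx(struct demo_ctx *ctx)
{
	return read_mark_mutable(ctx) + read_mark_const(ctx);
}
```

Since the helper only reads the pointer value to load it into r1, the const qualifier should not matter for the "r" input operand, but that is an assumption worth confirming against the BPF back end.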