On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
Port of tail_call_static() helper function from Cilium's BPF code base [0]
to libbpf, so others can easily consume it as well. We've been using this
in production code for some time now. The main idea is that we guarantee
that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
JITed BPF insns with direct jumps instead of having to fall back to using
expensive retpolines. By using inline asm, we guarantee that the compiler
won't merge the call from different paths with potentially different
content of r2/r3.
We're also using __throw_build_bug() macro in different places as a neat
trick to trigger compilation errors when compiler does not remove code at
compilation time. This works for the BPF backend as it does not implement
the __builtin_trap().
[0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
---
tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index 1106777df00b..18b75a4c82e6 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -53,6 +53,38 @@
})
#endif
+/*
+ * Misc useful helper macros
+ */
+#ifndef __throw_build_bug
+# define __throw_build_bug() __builtin_trap()
+#endif
this will become part of libbpf stable API, do we want/need to expose
it? If we want to expose it, then we should probably provide a better
description.
But also curious, how is it better than _Static_assert() (see
test_cls_redirect.c), which also allows to provide a better error
message?
Need to get back to you whether that has same semantics. We use the __throw_build_bug()
also in __bpf_memzero() and friends [0] as a way to trigger a hard build bug if we hit
a default switch-case [0], so we detect unsupported sizes which are not covered by the
implementation yet. If _Static_assert (0, "foo") does the trick, we could also use that;
will check with our code base.
[0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
+static __always_inline void
+bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
+{
+ if (!__builtin_constant_p(slot))
+ __throw_build_bug();
+
+ /*
+ * Don't gamble, but _guarantee_ that LLVM won't optimize setting
+ * r2 and r3 from different paths ending up at the same call insn as
+ * otherwise we won't be able to use the jmpq/nopl retpoline-free
+ * patching by the x86-64 JIT in the kernel.
+ *
So the clobbering comment below is completely clear. But this one is
less clear without some sort of example situation in which bad things
happen. Do you mind providing some pseudo-C example in which the
compiler will optimize things in such a way that the tail call
patching won't happen?
The details are pretty much here [1] and mentioned at plumbers, so if we end up from
different paths with different map or const key at the same tail-call call insn, then
the record_func_key() will determine that we need to emit retpoline instead of patching
which could have occured with two tail-call call insns, for example. This helper is just
to guarantee that the latter will always happen.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2e4c1e6c2947269346054ac8937ccfe9e0bcc6b
+ * Note on clobber list: we need to stay in-line with BPF calling
+ * convention, so even if we don't end up using r0, r4, r5, we need
+ * to mark them as clobber so that LLVM doesn't end up using them
+ * before / after the call.
+ */
+ asm volatile("r1 = %[ctx]\n\t"
+ "r2 = %[map]\n\t"
+ "r3 = %[slot]\n\t"
+ "call 12\n\t"
+ :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
+ : "r0", "r1", "r2", "r3", "r4", "r5");
+}
+
/*
* Helper structure used by eBPF C program
* to describe BPF map attributes to libbpf loader
--
2.21.0