Re: Actual tail call count limits in x86 JITs and interpreter

Daniel Borkmann <daniel@xxxxxxxxxxxxx> · Fri, 10 Sep 2021 01:20:27 +0200

On 9/10/21 12:15 AM, Johan Almbladh wrote:
I have done some investigation into this matter, and this is what I
have found. To recap, the situation is as follows.

MAX_TAIL_CALL_CNT is defined to be 32. Since the interpreter has used
33 instead, we agree to use that limit across all JIT implementations
to not break any user space program. To make sure everything uses the
same limit, we must first understand what the current state actually
is so we know what to fix.

Me: according to test_bpf.ko the tail call limit is 33 for the
interpreter, and 32 for the x86-64 JIT.
Paul: according to selftests the tail call limit is 33 for both the
interpreter and the x86-64 JIT.

Link: https://lore.kernel.org/bpf/20210809093437.876558-1-johan.almbladh@xxxxxxxxxxxxxxxxx/

I have been able to reproduce the above selftests results using
vmtest.sh. Digging deeper into this, I found that there are actually
two different code paths where the tail call count is checked in the
x86-64 JIT, corresponding to direct and indirect tail calls. By
setting different limits in those two places, I found that selftests
tailcall_3 hits the limit in emit_bpf_tail_call_direct(), whereas the
test_bpf.ko is limited by emit_bpf_tail_call_indirect().

I hacked a quick test case so that both are covered, see attached. Both
have the same behavior from my testing with and without the x86-64 JIT.
Bit late over here, will check more tomorrow, but in both cases we also
emit the same JIT code wrt counter update..

I am not 100% sure that this is the correct explanation, but it sounds
very reasonable. However, the asm generated in the two cases look very
similar to me, so by looking at that alone I cannot really see that
the limits would be different. Perhaps someone more versed in x86 asm
could take a closer look.

What are your thoughts?

Johan


>From c68f8e699a639392f24244068c1b49389f67f302 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Date: Thu, 9 Sep 2021 22:55:13 +0000
Subject: [PATCH bpf-next] bpf, selftests: Replicate tailcall_3 limit test for indirect call case

The tailcall_3 test program uses bpf_tail_call_static() where the JIT
would patch a direct jump. Add a new tailcall_6 test program replicating
exactly the same test just ensuring that bpf_tail_call() uses a map index
where the verifier cannot make assumptions this time. In other words this
will now cover both on x86-64 JIT, emit_bpf_tail_call_direct() emission
as well as emit_bpf_tail_call_indirect() emission.

  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  #136/7 tailcalls/tailcall_bpf2bpf_1:OK
  #136/8 tailcalls/tailcall_bpf2bpf_2:OK
  #136/9 tailcalls/tailcall_bpf2bpf_3:OK
  #136/10 tailcalls/tailcall_bpf2bpf_4:OK
  #136/11 tailcalls/tailcall_bpf2bpf_5:OK
  #136 tailcalls:OK
  Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: Johan Almbladh <johan.almbladh@xxxxxxxxxxxxxxxxx>
Cc: Paul Chaignon <paul@xxxxxxxxx>
Cc: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx>
---
 .../selftests/bpf/prog_tests/tailcalls.c      | 25 ++++++++++---
 tools/testing/selftests/bpf/progs/tailcall6.c | 36 +++++++++++++++++++
 2 files changed, 56 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall6.c

diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
index b5940e6ca67c..7bf3a7a97d7b 100644
--- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c
+++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
@@ -219,10 +219,7 @@ static void test_tailcall_2(void)
 	bpf_object__close(obj);
 }
 
-/* test_tailcall_3 checks that the count value of the tail call limit
- * enforcement matches with expectations.
- */
-static void test_tailcall_3(void)
+static void test_tailcall_count(const char *which)
 {
 	int err, map_fd, prog_fd, main_fd, data_fd, i, val;
 	struct bpf_map *prog_array, *data_map;
@@ -231,7 +228,7 @@ static void test_tailcall_3(void)
 	__u32 retval, duration;
 	char buff[128] = {};
 
-	err = bpf_prog_load("tailcall3.o", BPF_PROG_TYPE_SCHED_CLS, &obj,
+	err = bpf_prog_load(which, BPF_PROG_TYPE_SCHED_CLS, &obj,
 			    &prog_fd);
 	if (CHECK_FAIL(err))
 		return;
@@ -296,6 +293,22 @@ static void test_tailcall_3(void)
 	bpf_object__close(obj);
 }
 
+/* test_tailcall_3 checks that the count value of the tail call limit
+ * enforcement matches with expectations. JIT uses direct jump.
+ */
+static void test_tailcall_3(void)
+{
+	test_tailcall_count("tailcall3.o");
+}
+
+/* test_tailcall_6 checks that the count value of the tail call limit
+ * enforcement matches with expectations. JIT uses indirect jump.
+ */
+static void test_tailcall_6(void)
+{
+	test_tailcall_count("tailcall6.o");
+}
+
 /* test_tailcall_4 checks that the kernel properly selects indirect jump
  * for the case where the key is not known. Latter is passed via global
  * data to select different targets we can compare return value of.
@@ -822,6 +835,8 @@ void test_tailcalls(void)
 		test_tailcall_4();
 	if (test__start_subtest("tailcall_5"))
 		test_tailcall_5();
+	if (test__start_subtest("tailcall_6"))
+		test_tailcall_6();
 	if (test__start_subtest("tailcall_bpf2bpf_1"))
 		test_tailcall_bpf2bpf_1();
 	if (test__start_subtest("tailcall_bpf2bpf_2"))
diff --git a/tools/testing/selftests/bpf/progs/tailcall6.c b/tools/testing/selftests/bpf/progs/tailcall6.c
new file mode 100644
index 000000000000..9c298d908693
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tailcall6.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+
+#include <bpf/bpf_helpers.h>
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+	__uint(max_entries, 1);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+int count = 0;
+volatile int which;
+
+SEC("classifier/0")
+int bpf_func_0(struct __sk_buff *skb)
+{
+	count++;
+	if (__builtin_constant_p(which))
+		__bpf_unreachable();
+	bpf_tail_call(skb, &jmp_table, which);
+	return 1;
+}
+
+SEC("classifier")
+int entry(struct __sk_buff *skb)
+{
+	if (__builtin_constant_p(which))
+		__bpf_unreachable();
+	bpf_tail_call(skb, &jmp_table, which);
+	return 0;
+}
+
+char __license[] SEC("license") = "GPL";
+int _version SEC("version") = 1;
-- 
2.27.0