This patchset fixes a tailcall hierarchy issue.

The issue was confirmed in the discussion of "bpf, x64: Fix tailcall
infinite loop"[0].

The issue has been resolved on both x86_64 and arm64[1].

I provide a long commit message in the second patch to describe in detail
how the issue happens and how this patchset resolves it.

How does this patchset resolve the issue?

In short, it stores tail_call_cnt and tail_call_cnt_ptr on the stack of the
main prog.

First, at the prologue of the main prog, it initializes tail_call_cnt and
prepares tail_call_cnt_ptr. At the prologue of a subprog, it reuses the
tail_call_cnt_ptr from the caller.

Then, when a tailcall happens, it increments tail_call_cnt through its
pointer.

v4 -> v5:
  * Solution changes from tailcall run ctx to tail_call_cnt and its pointer,
    because the v4 solution cannot handle the case where there is no
    tailcall in the subprog but there is a tailcall in an EXT prog attached
    to the subprog.

v3 -> v4:
  * Solution changes from per-task tail_call_cnt to tailcall run ctx,
    because there is a case that the per-cpu/per-task solution cannot
    handle[2].

v2 -> v3:
  * Solution changes from percpu tail_call_cnt to tail_call_cnt at
    task_struct.

v1 -> v2:
  * Solution changes from extra run-time call insn to percpu tail_call_cnt.
  * Address comments from Alexei:
    * Use percpu tail_call_cnt.
    * Use asm to make sure no callee saved registers are touched.

RFC v2 -> v1:
  * Solution changes from propagating tail_call_cnt with its pointer to
    extra run-time call insn.
  * Address comments from Maciej:
    * Replace all memcpy(prog, x86_nops[5], X86_PATCH_SIZE) with
      emit_nops(&prog, X86_PATCH_SIZE)

RFC v1 -> RFC v2:
  * Address comments from Stanislav:
    * Separate moving emit_nops() as first patch.
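
For illustration only, the counter scheme summarized at the top of this
letter can be modeled in plain userspace C. This is a rough sketch and not
the JIT code from the patches: struct sim, the function names and the
"limit" parameter are hypothetical, and the by-value variant is my
simplified assumption of a per-frame counter copy, included only for
contrast. The modeled prog calls a subprog twice, and each subprog
tail-calls the prog again.

```c
/* Hypothetical userspace model; none of these names exist in the kernel. */
struct sim {
	unsigned int limit;	/* stands in for MAX_TAIL_CALL_CNT */
	unsigned long runs;	/* how many times the prog body executed */
};

static void prog_body_ptr(struct sim *s, unsigned int *tail_call_cnt_ptr);
static void prog_body_val(struct sim *s, unsigned int tail_call_cnt);

/* Subprog reuses the caller's pointer and increments through it. */
static void subprog_ptr(struct sim *s, unsigned int *tail_call_cnt_ptr)
{
	if (*tail_call_cnt_ptr >= s->limit)
		return;			/* limit reached: tailcall is skipped */
	(*tail_call_cnt_ptr)++;
	/* The tailcall target does not re-initialize the counter. */
	prog_body_ptr(s, tail_call_cnt_ptr);
}

static void prog_body_ptr(struct sim *s, unsigned int *tail_call_cnt_ptr)
{
	s->runs++;
	subprog_ptr(s, tail_call_cnt_ptr);
	subprog_ptr(s, tail_call_cnt_ptr);
}

/* Main prog prologue: tail_call_cnt lives on its stack, pointer is derived. */
unsigned long run_with_shared_counter(unsigned int limit)
{
	struct sim s = { .limit = limit, .runs = 0 };
	unsigned int tail_call_cnt = 0;
	unsigned int *tail_call_cnt_ptr = &tail_call_cnt;

	prog_body_ptr(&s, tail_call_cnt_ptr);
	return s.runs;
}

/* Contrast (assumed model): each frame gets its own copy of the counter. */
static void subprog_val(struct sim *s, unsigned int tail_call_cnt)
{
	if (tail_call_cnt >= s->limit)
		return;
	/* The increment is visible only down this branch, not to the caller. */
	prog_body_val(s, tail_call_cnt + 1);
}

static void prog_body_val(struct sim *s, unsigned int tail_call_cnt)
{
	s->runs++;
	subprog_val(s, tail_call_cnt);
	subprog_val(s, tail_call_cnt);
}

unsigned long run_with_copied_counter(unsigned int limit)
{
	struct sim s = { .limit = limit, .runs = 0 };

	prog_body_val(&s, 0);
	return s.runs;
}
```

In this model the shared counter bounds the number of prog runs at
limit + 1 regardless of how many subprogs issue tailcalls, while the copied
counter lets the run count grow as 2^(limit + 1) - 1. A small limit such as
4 is enough to see the difference; the kernel's MAX_TAIL_CALL_CNT is 33.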

Links:
[0] https://lore.kernel.org/bpf/6203dd01-789d-f02c-5293-def4c1b18aef@xxxxxxxxx/
[1] https://github.com/kernel-patches/bpf/pull/7244/checks
[2] https://lore.kernel.org/bpf/CAADnVQK1qF+uBjwom2s2W-yEmgd_3rGi5Nr+KiV3cW0T+UPPfA@xxxxxxxxxxxxxx/

Leon Hwang (3):
  bpf, x64: Fix tailcall hierarchy
  bpf, arm64: Fix tailcall hierarchy
  selftests/bpf: Add testcases for tailcall hierarchy fixing

 arch/arm64/net/bpf_jit_comp.c                 |  57 ++-
 arch/x86/net/bpf_jit_comp.c                   | 107 +++-
 .../selftests/bpf/prog_tests/tailcalls.c      | 479 ++++++++++++++++++
 .../bpf/progs/tailcall_bpf2bpf_hierarchy1.c   |  34 ++
 .../bpf/progs/tailcall_bpf2bpf_hierarchy2.c   |  55 ++
 .../bpf/progs/tailcall_bpf2bpf_hierarchy3.c   |  46 ++
 .../progs/tailcall_bpf2bpf_hierarchy_fentry.c |  35 ++
 tools/testing/selftests/bpf/progs/tc_dummy.c  |  12 +
 8 files changed, 781 insertions(+), 44 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy1.c
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy2.c
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy3.c
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_fentry.c
 create mode 100644 tools/testing/selftests/bpf/progs/tc_dummy.c

-- 
2.44.0