On 11/7/24 08:15, Eduard Zingerman wrote: > On Mon, 2024-06-24 at 00:15 +0800, Leon Hwang wrote: >> This patch fixes a tailcall issue caused by abusing the tailcall in >> bpf2bpf feature. >> >> As we know, tail_call_cnt propagates by rax from caller to callee when >> to call subprog in tailcall context. But, like the following example, >> MAX_TAIL_CALL_CNT won't work because of missing tail_call_cnt >> back-propagation from callee to caller. >> >> \#include <linux/bpf.h> >> \#include <bpf/bpf_helpers.h> >> \#include "bpf_legacy.h" >> >> struct { >> __uint(type, BPF_MAP_TYPE_PROG_ARRAY); >> __uint(max_entries, 1); >> __uint(key_size, sizeof(__u32)); >> __uint(value_size, sizeof(__u32)); >> } jmp_table SEC(".maps"); >> >> int count = 0; >> >> static __noinline >> int subprog_tail1(struct __sk_buff *skb) >> { >> bpf_tail_call_static(skb, &jmp_table, 0); >> return 0; >> } >> >> static __noinline >> int subprog_tail2(struct __sk_buff *skb) >> { >> bpf_tail_call_static(skb, &jmp_table, 0); >> return 0; >> } >> >> SEC("tc") >> int entry(struct __sk_buff *skb) >> { >> volatile int ret = 1; >> >> count++; >> subprog_tail1(skb); >> subprog_tail2(skb); >> >> return ret; >> } >> >> char __license[] SEC("license") = "GPL"; >> >> At run time, the tail_call_cnt in entry() will be propagated to >> subprog_tail1() and subprog_tail2(). But, when the tail_call_cnt in >> subprog_tail1() updates when bpf_tail_call_static(), the tail_call_cnt >> in entry() won't be updated at the same time. As a result, in entry(), >> when tail_call_cnt in entry() is less than MAX_TAIL_CALL_CNT and >> subprog_tail1() returns because of MAX_TAIL_CALL_CNT limit, >> bpf_tail_call_static() in suprog_tail2() is able to run because the >> tail_call_cnt in subprog_tail2() propagated from entry() is less than >> MAX_TAIL_CALL_CNT. >> >> So, how many tailcalls are there for this case if no error happens? >> >> From top-down view, does it look like hierarchy layer and layer? >> >> With this view, there will be 2+4+8+...+2^33 = 2^34 - 2 = 17,179,869,182 >> tailcalls for this case. >> >> How about there are N subprog_tail() in entry()? There will be almost >> N^34 tailcalls. >> >> Then, in this patch, it resolves this case on x86_64. >> >> In stead of propagating tail_call_cnt from caller to callee, it >> propagates its pointer, tail_call_cnt_ptr, tcc_ptr for short. >> >> However, where does it store tail_call_cnt? >> >> It stores tail_call_cnt on the stack of main prog. When tail call >> happens in subprog, it increments tail_call_cnt by tcc_ptr. >> >> Meanwhile, it stores tail_call_cnt_ptr on the stack of main prog, too. >> >> And, before jump to tail callee, it has to pop tail_call_cnt and >> tail_call_cnt_ptr. >> >> Then, at the prologue of subprog, it must not make rax as >> tail_call_cnt_ptr again. It has to reuse tail_call_cnt_ptr from caller. >> >> As a result, at run time, it has to recognize rax is tail_call_cnt or >> tail_call_cnt_ptr at prologue by: >> >> 1. rax is tail_call_cnt if rax is <= MAX_TAIL_CALL_CNT. >> 2. rax is tail_call_cnt_ptr if rax is > MAX_TAIL_CALL_CNT, because a >> pointer won't be <= MAX_TAIL_CALL_CNT. >> >> Furthermore, when trampoline is the caller of bpf prog, which is >> tail_call_reachable, it is required to propagate rax through trampoline. >> >> Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") >> Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT") >> Signed-off-by: Leon Hwang <hffilwlqm@xxxxxxxxx> >> --- > > Hi Leon, > > Sorry for delayed response. > I've looked through this patch and the changes make sense to me. > One thing that helped to understand the gist of the changes, > was dumping jited program using bpftool and annotating it with comments: > https://gist.github.com/eddyz87/0d48da052e9d174b2bb84174295c4215 > Maybe consider adding something along these lines to the patch > description? Sure, I'll resend the patch with updated description along annotating comments later. Thanks, Leon > > Reviewed-by: Eduard Zingerman <eddyz87@xxxxxxxxx> > > [...]