On Thu, Jun 16, 2022 at 05:22 PM +02, Daniel Borkmann wrote: > On 6/16/22 5:00 PM, Maciej Fijalkowski wrote: >> On Thu, Jun 16, 2022 at 01:02:52PM +0200, Jakub Sitnicki wrote: >>> While working aarch64 JIT to allow mixing bpf2bpf calls with tailcalls, I >>> noticed unexpected tailcall behavior in x86 JIT. >>> >>> I don't know if it is by design or a bug. The bpf_tail_call helper >>> documentation says that the user should not expect the control flow to >>> return to the previous program, if the tail call was successful: >>> >>>> If the call succeeds, the kernel immediately runs the first >>>> instruction of the new program. This is not a function call, >>>> and it never returns to the previous program. >>> >>> However, when a tailcall happens from a subprogram, that is after a bpf2bpf >>> call, that is not the case. We return to the caller program because the >>> stack destruction is too shallow. BPF stack of just the top-most BPF >>> function gets destroyed. >>> >>> This in turn allows the return value of the tailcall'ed program to get >>> overwritten, as the test below test demonstrates. It currently fails on >>> x86: >> Disclaimer: some time has passed by since I looked into this :P >> To me the bug would be if test would have returned 1 in your case. If I >> recall correctly that was the design choice, so tailcalls when mixed with >> bpf2bpf will consume current stack frame. When tailcall happens from >> subprogram then we would return to the caller of this subprog. We added >> logic to verifier that checks if this (tc + bpf2bpf) mix wouldn't cause >> stack overflow. We even limit the stack frame size to 256 in such case. > > Yes, that is the desired behavior, so return 2 from your example below looks > correct / expected: > > +SEC("tc") > +int classifier_0(struct __sk_buff *skb __unused) > +{ > + done = 1; > + return 0; > +} > + > +static __noinline > +int subprog_tail(struct __sk_buff *skb) > +{ > + bpf_tail_call_static(skb, &jmp_table, 0); > + return 1; > +} > + > +SEC("tc") > +int entry(struct __sk_buff *skb) > +{ > + subprog_tail(skb); > + return 2; > +} Great. Thanks for confirming. Since I have the test ready, I might as well submit it. I think the case of ignoring the tailcall result is not covered yet. Also, this makes changes needed to support bpf2bpf+tailcalls on arm64 simpler. Will post soon. > >> Cilium docs explain this: >> https://docs.cilium.io/en/latest/bpf/#bpf-to-bpf-calls