On Thu, Jun 16, 2022 at 04:45:09PM +0200, Daniel Borkmann wrote: > On 6/15/22 5:17 PM, Jakub Sitnicki wrote: > [...] > > int entry(struct __sk_buff * skb): > > 0xffffffffa0201788: nop DWORD PTR [rax+rax*1+0x0] > > 0xffffffffa020178d: xor eax,eax > > 0xffffffffa020178f: push rbp > > 0xffffffffa0201790: mov rbp,rsp > > 0xffffffffa0201793: sub rsp,0x8 > > 0xffffffffa020179a: push rax > > 0xffffffffa020179b: xor esi,esi > > 0xffffffffa020179d: mov BYTE PTR [rbp-0x1],sil > > 0xffffffffa02017a1: mov rax,QWORD PTR [rbp-0x9] !!! tail call count > > 0xffffffffa02017a8: call 0xffffffffa02017d8 !!! is at rbp-0x10 > > 0xffffffffa02017ad: leave > > 0xffffffffa02017ae: ret > > > > Fix it by rounding up the BPF stack depth to a multiple of 8, when > > calculating the tail call count offset on stack. > > > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") > > Signed-off-by: Jakub Sitnicki <jakub@xxxxxxxxxxxxxx> > > --- > > arch/x86/net/bpf_jit_comp.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > > index f298b18a9a3d..c98b8c0ed3b8 100644 > > --- a/arch/x86/net/bpf_jit_comp.c > > +++ b/arch/x86/net/bpf_jit_comp.c > > @@ -1420,8 +1420,9 @@ st: if (is_imm8(insn->off)) > > case BPF_JMP | BPF_CALL: > > func = (u8 *) __bpf_call_base + imm32; > > if (tail_call_reachable) { > > + /* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */ > > EMIT3_off32(0x48, 0x8B, 0x85, > > - -(bpf_prog->aux->stack_depth + 8)); > > + -round_up(bpf_prog->aux->stack_depth, 8) - 8); > > Lgtm, great catch by the way! Indeed! Acked-by: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx> I was wondering if it would be possible to work only on rounded up to 8 stack depth from JIT POV since it's what we do everywhere we use it...