On Sun, Sep 1, 2024 at 6:41 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote: > > @@ -573,10 +575,13 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t, > > /* > * See emit_prologue(), for IBT builds the trampoline hook is preceded > - * with an ENDBR instruction. > + * with an ENDBR instruction and 3 bytes tail_call_cnt initialization > + * instruction. > */ > if (is_endbr(*(u32 *)ip)) > ip += ENDBR_INSN_SIZE; > + if (is_bpf_text_address((long)ip)) > + ip += X86_POKE_EXTRA; This is a foot gun. bpf_arch_text_poke() is used not only at the beginning of the function. So unconditional ip += 3 is not just puzzling with 'what is this for', but dangerous and wasteful... > @@ -2923,6 +2930,8 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im > */ > if (is_endbr(*(u32 *)orig_call)) > orig_call += ENDBR_INSN_SIZE; > + if (is_bpf_text_address((long)orig_call)) > + orig_call += X86_POKE_EXTRA; > orig_call += X86_PATCH_SIZE; > } ..this bit needs to be hacked too... > @@ -3025,6 +3034,7 @@ static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *rw_im > /* remember return value in a stack for bpf prog to access */ > emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8); > im->ip_after_call = image + (prog - (u8 *)rw_image); > + emit_nops(&prog, X86_POKE_EXTRA); > emit_nops(&prog, X86_PATCH_SIZE); And this is just pure waste of kernel code and cpu run-time. You're adding 3 byte nop for no reason at all. See commit e21aa341785c ("bpf: Fix fexit trampoline.") that added: int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_JUMP, NULL, im->ip_epilogue); logic that is patching bpf trampoline in the middle of it. (not at the start). Because of unconditional +=3 in bpf_arch_text_poke() every trampoline will have to waste nop3 ? No. Please fix freplace and tail call combination without this kind of unacceptable shortcuts. I very much prefer to stop hacking into JITs and trampolines because tailcalls and freplace don't work well together. We cannot completely disable that combination because libxdp is using freplace to populate chain of progs the main prog is calling and these freplace progs might be doing tailcall, but we can still prevent such infinite loop that you describe in commit log: entry_tc -> subprog_tc -> entry_freplace -> subprog_tail --tailcall-> entry_tc in the verifier without resorting to JIT hacks. pw-bot: cr