On Thu, Nov 14, 2019 at 07:23:46PM -0800, Alexei Starovoitov wrote: > On Fri, Nov 15, 2019 at 02:04:01AM +0100, Daniel Borkmann wrote: > > for later modifications. In ii) fixup_bpf_tail_call_direct() walks > > over the progs poke_tab, locks the tail call maps poke_mutex to > > prevent from parallel updates and patches in the right locations via > ... > > @@ -1610,6 +1671,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) > > prog->bpf_func = (void *)image; > > prog->jited = 1; > > prog->jited_len = proglen; > > + fixup_bpf_tail_call_direct(prog); > > Why not to move fixup_bpf_tail_call_direct() just before > bpf_jit_binary_lock_ro() and use simple memcpy instead of text_poke ? Thinking about it, I'll move it right into the branch before we lock ... if (!prog->is_func || extra_pass) { bpf_tail_call_fixup_direct(prog); bpf_jit_binary_lock_ro(header); } else { [...] ... and I'll add a __bpf_arch_text_poke() handler which passes in the a plain memcpy() callback instead of text_poke_bp(), so it keeps reusing most of the logic/checks from __bpf_arch_text_poke() which we also have at a later point once the program is live. > imo this logic in patch 7: > case BPF_JMP | BPF_TAIL_CALL: > + if (imm32) > + emit_bpf_tail_call_direct(&bpf_prog->aux->poke_tab[imm32 - 1], > would have been easier to understand if patch 7 and 8 were swapped. Makes sense, it's totally fine to swap them, so I'll go do that. Thanks for the feedback! Cheers, Daniel