On 2023/9/16 00:29, Zvi Effron wrote: > On Thu, Sep 14, 2023 at 7:36 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote: >> >> It's unnecessary to check 'imm32' in both 'if' and 'else'. >> >> It's better to check it first. >> >> Meanwhile, refactor the code for 'offs' calculation. >> >> v1 -> v2: >> * Add '#define RESTORE_TAIL_CALL_CNT_INSN_SIZE 7'. >> >> Signed-off-by: Leon Hwang <hffilwlqm@xxxxxxxxx> >> --- >> arch/x86/net/bpf_jit_comp.c | 18 +++++++++--------- >> 1 file changed, 9 insertions(+), 9 deletions(-) >> >> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c >> index 2846c21d75bfa..fe0393c7e7b68 100644 >> --- a/arch/x86/net/bpf_jit_comp.c >> +++ b/arch/x86/net/bpf_jit_comp.c >> @@ -1025,6 +1025,7 @@ static void emit_shiftx(u8 **pprog, u32 dst_reg, u8 src_reg, bool is64, u8 op) >> /* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */ >> #define RESTORE_TAIL_CALL_CNT(stack) \ >> EMIT3_off32(0x48, 0x8B, 0x85, -round_up(stack, 8) - 8) >> +#define RESTORE_TAIL_CALL_CNT_INSN_SIZE 7 >> >> static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image, >> int oldproglen, struct jit_context *ctx, bool jmp_padding) >> @@ -1629,17 +1630,16 @@ st: if (is_imm8(insn->off)) >> case BPF_JMP | BPF_CALL: { >> int offs; >> > > <snip> > >> + if (tail_call_reachable) >> RESTORE_TAIL_CALL_CNT(bpf_prog->aux->stack_depth); > > <snip> > >> + offs = (tail_call_reachable ? >> + RESTORE_TAIL_CALL_CNT_INSN_SIZE : 0); > > I'm not sure which is preferred style for the kernel, but this seems like it > could be replaced with I'm not sure either. > > int offs = 0; > ... > if (tail_call_reachable) { > RESTORE_TAIL_CALL_CNT(bpf_prog->aux->stack_depth); > offs = RESTORE_TAIL_CALL_CNT_INSN_SIZE; > } I considered this way once. But, I think the code of the patch is more clean. > > which is easier for my brain to follow because it reduces the branches (in C, > I assume the compiler is smart enough to optimize). It does create an > unconditional write (of 0) that could otherwise be avoided (unless the compiler > is also smart enough to optimize that). I think, as for gcc -O2, there's no diff between these two ways. Thanks, Leon > > Also not sure if the performance difference matters here. > >> + offs += x86_call_depth_emit_accounting(&prog, func); >> if (emit_call(&prog, func, image + addrs[i - 1] + offs)) >> return -EINVAL; >> break; >> >> base-commit: cbb1dbcd99b0ae74c45c4c83c6d213c12c31785c >> -- >> 2.41.0 >> >>