On 23/8/23 05:29, Alexei Starovoitov wrote:
> On Mon, Aug 21, 2023 at 8:17 PM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>>
>> On 22/8/23 06:33, Alexei Starovoitov wrote:
>>> On Fri, Aug 18, 2023 at 8:12 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>>>>

[SNIP]

>>>>  *    sub rsp, 16                     // space for skb and dev
>>>> - *    push rbx                        // temp regs to pass start time
>>>> + *    mov qword ptr [rbp - 40], rbx   // temp regs to pass start time
>>>> + *    mov rax, 2                      // cache number of argument to rax
>>>
>>> What does it mean?
>>
>> I think it's the corresponding instruction to the following code snippet
>> in arch_prepare_bpf_trampoline().
>>
>>	/* Store number of argument registers of the traced function:
>>	 *   mov rax, nr_regs
>>	 *   mov QWORD PTR [rbp - nregs_off], rax
>>	 */
>>	emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_regs);
>>	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -nregs_off);
>
> Ahh. I see.
> The comment on top of arch_prepare_bpf_trampoline() is hopelessly obsolete.
> Don't touch it in this patch set. We probably should delete it at some point
> or take an effort to update it thoroughly.

Got it.

> Earlier recommendation to you was to update this comment:
> /* Generated trampoline stack layout:
>
>>>
>>>> + *    mov qword ptr [rbp - 32], rax   // save number of argument to stack
>>>
>>> Here // is ok since it's inside /* */
>>
>> Got it.
>>
>>>
>>>>  *    mov qword ptr [rbp - 16], rdi   // save skb pointer to stack
>>>>  *    mov qword ptr [rbp - 8], rsi    // save dev pointer to stack
>>>>  *    call __bpf_prog_enter           // rcu_read_lock and preempt_disable
>>>> @@ -2323,7 +2331,9 @@ static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
>>>>  *    push rbp
>>>>  *    mov rbp, rsp
>>>>  *    sub rsp, 24                     // space for skb, dev, return value
>>>> - *    push rbx                        // temp regs to pass start time
>>>> + *    mov qword ptr [rbp - 40], rbx   // temp regs to pass start time
>>>> + *    mov rax, 2                      // cache number of argument to rax
>>>> + *    mov qword ptr [rbp - 32], rax   // save number of argument to stack
>>>>  *    mov qword ptr [rbp - 24], rdi   // save skb pointer to stack
>>>>  *    mov qword ptr [rbp - 16], rsi   // save dev pointer to stack
>>>>  *    call __bpf_prog_enter           // rcu_read_lock and preempt_disable
>>>> @@ -2400,6 +2410,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>>>  *                     [ ... ]
>>>>  *                     [ stack_arg2 ]
>>>>  * RBP - arg_stack_off [ stack_arg1 ]
>>>> + * RSP                 [ tail_call_cnt ] BPF_TRAMP_F_TAIL_CALL_CTX
>>>>  */
>>>>
>>>>  	/* room for return value of orig_call or fentry prog */
>>>> @@ -2464,6 +2475,8 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>>>  	else
>>>>  		/* sub rsp, stack_size */
>>>>  		EMIT4(0x48, 0x83, 0xEC, stack_size);
>>>> +	if (flags & BPF_TRAMP_F_TAIL_CALL_CTX)
>>>> +		EMIT1(0x50);	/* push rax */
>>>>  	/* mov QWORD PTR [rbp - rbx_off], rbx */
>>>>  	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_6, -rbx_off);
>>>>
>>>> @@ -2516,9 +2529,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>>>  		restore_regs(m, &prog, regs_off);
>>>>  		save_args(m, &prog, arg_stack_off, true);
>>>>
>>>> +		if (flags & BPF_TRAMP_F_TAIL_CALL_CTX)
>>>> +			/* Before calling the original function, restore the
>>>> +			 * tail_call_cnt from stack to rax.
>>>> +			 */
>>>> +			RESTORE_TAIL_CALL_CNT(stack_size);
>>>> +
>>>>  		if (flags & BPF_TRAMP_F_ORIG_STACK) {
>>>> -			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
>>>> -			EMIT2(0xff, 0xd0); /* call *rax */
>>>> +			emit_ldx(&prog, BPF_DW, BPF_REG_6, BPF_REG_FP, 8);
>>>> +			EMIT2(0xff, 0xd3); /* call *rbx */ // FIXME: Confirm 0xd3?
>>>
>>> please no FIXME like comments.
>>> You have to be confident in the code you're submitting.
>>> llvm-mc -triple=x86_64 -show-encoding -x86-asm-syntax=intel
>>> -output-asm-variant=1 <<< 'call rbx'
>>
>> Got it. Thanks for the guide.
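
FWIW, the 0xd3 byte can also be derived by hand. Here is a minimal sketch
of the emission with the derivation spelled out (my own annotation on top
of the EMIT2() helper from bpf_jit_comp.c, not part of the patch):

	/* "call *%rbx" is CALL r/m64 (opcode 0xff, /2). The ModRM byte is
	 * mod=0b11 (register-direct), reg=0b010 (the /2 opcode extension),
	 * rm=0b011 (rbx), i.e. 0b11010011 == 0xd3. llvm-mc prints the same
	 * encoding: ff d3.
	 */
	EMIT2(0xff, 0xd3);	/* call *rbx */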
>>
>>>
>>>>  		} else {
>>>>  			/* call original function */
>>>>  			if (emit_rsb_call(&prog, orig_call, prog)) {

[SNIP]

>>>>
>>>>  	if (prog->type == BPF_PROG_TYPE_SYSCALL) {
>>>> @@ -19629,6 +19640,12 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
>>>>  		if (!tr)
>>>>  			return -ENOMEM;
>>>>
>>>> +		if (tgt_prog && tgt_prog->aux->tail_call_reachable) {
>>>> +			subprog = find_subprog_index(tgt_prog, btf_id);
>>>> +			tr->flags = subprog > 0 && tgt_prog->aux->func[subprog]->is_func ?
>>>> +				BPF_TRAMP_F_TAIL_CALL_CTX : 0;
>>>
>>> If prog has subprogs all of them will be 'is_func', no?
>>> What's the point of the search?
>>> Just tgt_prog->aux->tail_call_reachable and func_cnt > 0 would be enough?
>>
>> Wouldn't tgt_prog->aux->tail_call_reachable and subprog > 0 be enough?
>> The check has to confirm that the attach target is a subprog of tgt_prog
>> rather than tgt_prog itself.
>>
>> In tail call context, when 'call'-ing a func, tail_call_cnt is restored
>> to rax:
>>
>> static int do_jit() {
>>	/* call */
>>	case BPF_JMP | BPF_CALL: {
>>		int offs;
>>
>>		func = (u8 *) __bpf_call_base + imm32;
>>		if (tail_call_reachable) {
>>			/* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */
>>			EMIT3_off32(0x48, 0x8B, 0x85,
>>				    -round_up(bpf_prog->aux->stack_depth, 8) - 8);
>>			/* ... */
>>		}
>>	}
>>
>> As a result, when 'call'-ing a subprog, tail_call_cnt is passed in rax.
>> Are all subprogs run via 'call', including non-'is_func' subprogs?
>
> Let me ask again. Do you see a subprog that has is_func==0?

Oh, I get it.

In jit_subprogs(), all subprogs are 'is_func'. So it's unnecessary to check
tgt_prog->aux->func[subprog]->is_func.

I'll submit a new RFC PATCH later; a sketch of the simplified check is below.

Thanks,
Leon
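
P.S. A minimal sketch of the simplified check for the next revision
(untested; it assumes find_subprog_index() from this RFC, where index 0
means the attach target is tgt_prog itself rather than one of its subprogs):

	if (tgt_prog && tgt_prog->aux->tail_call_reachable) {
		subprog = find_subprog_index(tgt_prog, btf_id);
		/* jit_subprogs() marks every subprog as is_func, so
		 * subprog > 0 alone confirms the attach target is a
		 * subprog of tgt_prog.
		 */
		tr->flags = subprog > 0 ? BPF_TRAMP_F_TAIL_CALL_CTX : 0;
	}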