On 2023/8/31 21:12, Leon Hwang wrote:
> 
> 
> On 2023/8/31 06:49, Maciej Fijalkowski wrote:
>> On Sat, Aug 26, 2023 at 12:03:12PM +0800, Leon Hwang wrote:
>>>
>>>
>>> On 2023/8/26 01:58, Maciej Fijalkowski wrote:
>>>> On Fri, Aug 25, 2023 at 10:52:15PM +0800, Leon Hwang wrote:
>>>>> From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall
>>>>> handling in JIT"), the tailcall on x64 works better than before.
>>>>>
>>>>> From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms
>>>>> for x64 JIT"), tailcall is able to run in BPF subprograms on x64.
>>>>>
>>>>> From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program
>>>>> to other BPF programs"), BPF program is able to trace other BPF programs.
>>>>>
>>>>> How about combining them all together?
>>>>>
>>>>> 1. FENTRY/FEXIT on a BPF subprogram.
>>>>> 2. A tailcall runs in the BPF subprogram.
>>>>> 3. The tailcall calls itself.
>>>>
>>>> I would be interested in seeing broken asm code TBH :)
>>>>
>>>>>
>>>>> As a result, a tailcall infinite loop comes up. And the loop would halt
>>>>> the machine.
>>>>>
>>>>> As we know, in tail call context, the tail_call_cnt propagates by stack
>>>>> and rax register between BPF subprograms. So do it in trampolines.
>>>>>
>>>>> Signed-off-by: Leon Hwang <hffilwlqm@xxxxxxxxx>
>>>>> ---
>>>>>  arch/x86/net/bpf_jit_comp.c | 32 ++++++++++++++++++++++++++------
>>>>>  include/linux/bpf.h         |  5 +++++
>>>>>  kernel/bpf/trampoline.c     |  4 ++--
>>>>>  kernel/bpf/verifier.c       | 30 +++++++++++++++++++++++-------
>>>>>  4 files changed, 56 insertions(+), 15 deletions(-)
>>>>>
> 
> [SNIP]
> 
>>>>>
>>>>> +	if (tgt_prog && tgt_prog->aux->tail_call_reachable) {
>>>>> +		subprog = find_subprog_index(tgt_prog, btf_id);
>>>>> +		tr->flags = subprog > 0 ? BPF_TRAMP_F_TAIL_CALL_CTX : 0;
>>>>> +	}
>>>>
>>>> I kinda forgot trampoline internals so please bear with me.
>>>> Here you are checking actually...what? That current program is a subprog
>>>> of tgt prog? My knee jerk reaction would be to propagate the
>>>> BPF_TRAMP_F_TAIL_CALL_CTX based on just tail_call_reachable, but I need
>>>> some more time to get my head around it again, sorry :<
>>>
>>> Yeah, that current program must be a subprog of tgt prog.
>>>
>>> For example:
>>>
>>> tailcall_subprog() {
>>> 	bpf_tail_call_static(&jmp_table, 0);
>>> }
>>>
>>> tailcall_prog() {
>>> 	tailcall_subprog();
>>> }
>>>
>>> prog() {
>>> 	bpf_tail_call_static(&jmp_table, 0);
>>> }
>>>
>>> jmp_table populates with tailcall_prog().
>>>
>>> When do fentry on prog(), there's no tail_call_cnt for fentry to
>>> propagate. As we can see in emit_prologue(), fentry runs before
>>> initialising tail_call_cnt.
>>>
>>> When do fentry on tailcall_prog()? NO, it's impossible to do fentry on
>>> tailcall_prog(). Because the tailcall 'jmp' skips the fentry on
>>> tailcall_prog().
>>>
>>> And, when do fentry on tailcall_subprog(), fentry has to propagate
>>> tail_call_cnt properly.
>>>
>>> In conclusion, that current program must be a subprog of tgt prog.
>>
>> Verifier propagates the info about tail call usage through whole call
>> chain on a given prog so this doesn't really matter to me where do we
>> attach fentry progs. All I'm saying is:
>>
>> if (tgt_prog && tgt_prog->aux->tail_call_reachable)
>> 	tr->flags = BPF_TRAMP_F_TAIL_CALL_CTX;
>>
>> should be just fine. I might be missing something but with above your
>> selftest does not hang my system.
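
To make the quoted example concrete, here is a rough BPF C sketch of the
scenario (illustrative only, not the actual selftest in this series; the
program and map names just mirror the example above, and the usual
selftests/clang BPF build setup is assumed):

	/* Illustrative sketch only -- not the selftest shipped with this series.
	 * jmp_table[0] is assumed to be populated with tailcall_prog from
	 * userspace, so the tail call in the subprog re-enters its caller.
	 */
	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	struct {
		__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
		__uint(max_entries, 1);
		__uint(key_size, sizeof(__u32));
		__uint(value_size, sizeof(__u32));
	} jmp_table SEC(".maps");

	/* The subprog that does the tail call; an fentry attached here must
	 * keep propagating tail_call_cnt, otherwise the count restarts on
	 * every trampoline entry and the loop never hits MAX_TAIL_CALL_CNT.
	 */
	__noinline int tailcall_subprog(struct __sk_buff *skb)
	{
		bpf_tail_call_static(skb, &jmp_table, 0);
		return 0;
	}

	SEC("tc")
	int tailcall_prog(struct __sk_buff *skb)
	{
		/* The tail call in the subprog jumps back into this program. */
		return tailcall_subprog(skb);
	}

	SEC("tc")
	int prog(struct __sk_buff *skb)
	{
		/* Entry of the whole tail call context; its prologue
		 * initialises tail_call_cnt after any fentry has already run.
		 */
		bpf_tail_call_static(skb, &jmp_table, 0);
		return 0;
	}

	char _license[] SEC("license") = "GPL";

Attaching an fentry program to tailcall_subprog() of tailcall_prog()
(e.g. by pointing it at tailcall_prog's fd with
bpf_program__set_attach_target() before loading) is what exercises the
BPF_TRAMP_F_TAIL_CALL_CTX path, since the trampoline then has to keep
passing tail_call_cnt through.
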
> 
> I think it's unnecessary to propagate tail call usage info when do
> fentry on prog(), which is the entry of the whole tail call context. If
> do propagate in this case, it's meaningless to execute two extra
> instructions.

Because it's harmless, I agree with you. I'll change it to

	if (tgt_prog && tgt_prog->aux->tail_call_reachable)
		tr->flags = BPF_TRAMP_F_TAIL_CALL_CTX;

With this update, it's easier to understand BPF_TRAMP_F_TAIL_CALL_CTX.

> 
> I confirm that the above selftest is able to hang VM. I copy test_progs
> along with tailcall*.bpf.o to another VM, which is Ubuntu 22.04.3 with
> kernel 5.15.0-82-generic, then run ./test_progs -t tailcalls. And then
> the VM hangs.
> 
> Here's the Ubuntu 22.04.3 VM info:
> # uname -a
> Linux hwang 5.15.0-82-generic #91-Ubuntu SMP Mon Aug 14 14:14:14 UTC
> 2023 x86_64 x86_64 x86_64 GNU/Linux

What I suggest here is to run the selftest in the second patch, not the
above example.

Thanks,
Leon