On 2024/8/27 18:37, Eduard Zingerman wrote: > On Sun, 2024-08-25 at 21:09 +0800, Leon Hwang wrote: >> This patch fixes a tailcall infinite loop issue caused by freplace. >> >> Since commit 1c123c567fb1 ("bpf: Resolve fext program type when checking map compatibility"), >> freplace prog is allowed to tail call its target prog. Then, when a >> freplace prog attaches to its target prog's subprog and tail calls its >> target prog, kernel will panic. >> [...] >> >> As a result, the tail_call_cnt is stored on the stack of entry_tc. And >> the tail_call_cnt_ptr is propagated between subprog_tc, entry_freplace, >> subprog_tail and entry_tc. >> >> Furthermore, trampoline is required to propagate >> tail_call_cnt/tail_call_cnt_ptr always, no matter whether there is >> tailcall at run time. >> So, it reuses trampoline flag "BIT(7)" to tell trampoline to propagate >> the tail_call_cnt/tail_call_cnt_ptr, as BPF_TRAMP_F_TAIL_CALL_CTX is not >> used by any other arch BPF JIT. > > This change seem to be correct. > Could you please add an explanation somewhere why nop3/xor and nop5 > are swapped in the prologue? OK, I'll explain it in message of patch v2. > > As far as I understand, this is done so that freplace program > would reuse xor generated for replaced program, is that right? > E.g. for tailcall_bpf2bpf_freplace test case disasm looks as follows: > > --------------- entry_tc -------------- > func #0: > 0: f3 0f 1e fa endbr64 > 4: 48 31 c0 xorq %rax, %rax > 7: 0f 1f 44 00 00 nopl (%rax,%rax) > c: 55 pushq %rbp > d: 48 89 e5 movq %rsp, %rbp > 10: f3 0f 1e fa endbr64 > ... > > ------------ entry_freplace ----------- > func #0: > 0: f3 0f 1e fa endbr64 > 4: 0f 1f 00 nopl (%rax) > 7: 0f 1f 44 00 00 nopl (%rax,%rax) > c: 55 pushq %rbp > d: 48 89 e5 movq %rsp, %rbp > ... > > So, if entry_freplace would be used to replace entry_tc instead > of subprog_tc, the disasm would change to: > > --------------- entry_tc -------------- > func #0: > 0: f3 0f 1e fa endbr64 > 4: 48 31 c0 xorq %rax, %rax > 7: 0f 1f 44 00 00 jmp <entry_freplace> > > Thus reusing %rax initialization from entry_tc. > Great. You understand it correctly. >> However, the bad effect is that it requires initializing tail_call_cnt at >> prologue always no matter whether the prog is tail_call_reachable, because >> it is unable to confirm itself or its subprog[s] whether to be attached by >> freplace prog. >> And, when call subprog, tail_call_cnt_ptr is required to be propagated >> to subprog always. > > This seems unfortunate. > I wonder if disallowing to freplace programs when > replacement.tail_call_reachable != replaced.tail_call_reachable > would be a better option? > This idea is wonderful. We can disallow attaching tail_call_reachable freplace prog to not-tail_call_reachable bpf prog. So, the following 3 cases are allowed. 1. attach tail_call_reachable freplace prog to tail_call_reachable bpf prog. 2. attach not-tail_call_reachable freplace prog to tail_call_reachable bpf prog. 3. attach not-tail_call_reachable freplace prog to not-tail_call_reachable bpf prog. I would like to add this limit in patch v2, that tail_call_reachable freplace is disallowed to attach to not-tail_call_reachable bpf prog. Thereafter, tail_call_cnt_ptr is required to be propagated to subprog only when subprog is tail_call_reachable. Thanks, Leon