On Fri, Feb 23, 2024 at 7:30 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>
> On 2024/2/23 12:06, Pu Lehui wrote:
> >
> > On 2024/2/22 16:52, Leon Hwang wrote:
>
> [SNIP]
>
> >>  }
> >> @@ -575,6 +574,54 @@ static void emit_return(u8 **pprog, u8 *ip)
> >>  	*pprog = prog;
> >>  }
> >> +DEFINE_PER_CPU(u32, bpf_tail_call_cnt);
> >
> > Hi Leon, the solution really simplifies the complexity. If I
> > understand correctly, this TAIL_CALL_CNT becomes system-global
> > rather than prog-global, whereas before it limited the TCC of the
> > entry prog.
>
> Correct. It becomes a per-CPU global variable.
>
> But I think this solution is not robust enough.
>
> For example:
>
> time   prog1          prog1
>   =================================>
> line          prog2
>
> This is a timeline on one CPU. If prog1 and prog2 both run tail
> calls, prog2 will reset the tail_call_cnt on the current CPU that
> prog1 is still using. As a result, when the CPU switches back from
> prog2 to prog1, the tail_call_cnt on the current CPU has been reset
> to 0, no matter how far prog1 had already incremented it.
>
> The reset issue would happen even if the per-CPU tail_call_cnt moved
> into 'struct bpf_prog_aux', because one kprobe bpf prog can be
> triggered on many functions, e.g. with cilium/pwru. That move would
> still be better than this solution, though.

kprobe progs are not preemptible.
There is bpf_prog_active that disallows any recursion.
Moving this percpu count to prog->aux should solve it (rough sketches
below).

> I think my previous POC of 'struct bpf_prog_run_ctx' would be better.
> I'll resend it later with some improvements.

The percpu approach is still preferred, since it removes the rax mess.
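
To make the interleaving problem concrete, here is a minimal sketch of
the global per-CPU scheme. The two helpers are illustrative stand-ins
for what the JIT emits, not actual patch code:

	#include <linux/percpu.h>
	#include <linux/bpf.h>	/* MAX_TAIL_CALL_CNT (33) */

	DEFINE_PER_CPU(u32, bpf_tail_call_cnt);

	/* conceptually runs at the entry of every prog */
	static void prog_enter(void)
	{
		this_cpu_write(bpf_tail_call_cnt, 0);
	}

	/* conceptually runs at every tail-call site */
	static bool tail_call_allowed(void)
	{
		return this_cpu_inc_return(bpf_tail_call_cnt) <=
		       MAX_TAIL_CALL_CNT;
	}

	/*
	 * Interleaving on one CPU:
	 *   prog1 enters          -> cnt = 0
	 *   prog1 tail-calls 20x  -> cnt = 20
	 *   prog2 enters          -> cnt = 0  <- prog1's count is lost
	 *   prog2 exits
	 *   prog1 resumes         -> gets a fresh budget of 33 tail
	 *                            calls on top of the 20 it already
	 *                            used
	 */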
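
For reference, the recursion guard mentioned above: tracing entry
points bump a per-CPU bpf_prog_active counter and skip any nested prog
invocation. Roughly the shape of the check in trace_call_bpf()
(paraphrased from kernel/trace/bpf_trace.c, not a verbatim excerpt):

	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
		/* some bpf prog is already running on this CPU:
		 * don't call into another one (same or different)
		 */
		ret = 0;
		goto out;
	}
	/* ... run the prog ... */
 out:
	__this_cpu_dec(bpf_prog_active);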
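
And a sketch of the prog->aux variant: a per-prog, per-CPU counter, so
the count is keyed by (prog, CPU) instead of CPU alone. The field name
is hypothetical; the alloc/inc pattern mirrors how prog->active is
handled elsewhere in the kernel:

	struct bpf_prog_aux {
		/* ... existing fields ... */
		u32 __percpu *tail_call_cnt;	/* hypothetical field */
	};

	/* at prog load time */
	prog->aux->tail_call_cnt = alloc_percpu_gfp(u32, GFP_KERNEL);
	if (!prog->aux->tail_call_cnt)
		return -ENOMEM;

	/* at every tail-call site */
	static bool tail_call_allowed(struct bpf_prog *prog)
	{
		return this_cpu_inc_return(*prog->aux->tail_call_cnt) <=
		       MAX_TAIL_CALL_CNT;
	}

Since bpf_prog_active prevents a second tracing prog from entering
while one is already running on the same CPU, two runs of the same
prog cannot interleave there, so a per-prog per-CPU counter cannot be
clobbered the way the single global one can.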