On Fri, Feb 23, 2024 at 7:30 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>
> On 2024/2/23 12:06, Pu Lehui wrote:
> >
> > On 2024/2/22 16:52, Leon Hwang wrote:
>
> [SNIP]
>
> >>  }
> >> @@ -575,6 +574,54 @@ static void emit_return(u8 **pprog, u8 *ip)
> >>  	*pprog = prog;
> >>  }
> >> +DEFINE_PER_CPU(u32, bpf_tail_call_cnt);
> >
> > Hi Leon, the solution really simplifies the complexity. If I
> > understand correctly, this TAIL_CALL_CNT becomes system-global
> > rather than prog-global, whereas before it limited the TCC of the
> > entry prog.
>
> Correct. It becomes a per-CPU global variable.
>
> But I think this solution is not robust enough.
>
> For example:
>
> time   prog1          prog1
>   =================================>
> line          prog2
>
> This is a timeline on one CPU. If prog1 and prog2 both run tail
> calls, prog2 will reset the tail_call_cnt on the current CPU that
> prog1 is still using. As a result, when the CPU switches back from
> prog2 to prog1, the tail_call_cnt on the current CPU has been reset
> to 0, no matter how far prog1 had already incremented it.
>
> The reset issue would happen even if the per-CPU tail_call_cnt moved
> into 'struct bpf_prog_aux', because one kprobe bpf prog can be
> triggered on many functions, e.g. with cilium/pwru. That move would
> still be better than this solution, though.

kprobe progs are not preemptible.
There is bpf_prog_active that disallows any recursion.
Moving this percpu count to prog->aux should solve it (rough sketches
below).

> I think my previous POC of 'struct bpf_prog_run_ctx' would be better.
> I'll resend it later with some improvements.

The percpu approach is still preferred, since it removes the rax mess.
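
To make the interleaving problem concrete, here is a minimal sketch of
the global per-CPU scheme. The two helpers are illustrative stand-ins
for what the JIT emits, not actual patch code:

	#include <linux/percpu.h>
	#include <linux/bpf.h>	/* MAX_TAIL_CALL_CNT (33) */

	DEFINE_PER_CPU(u32, bpf_tail_call_cnt);

	/* conceptually runs at the entry of every prog */
	static void prog_enter(void)
	{
		this_cpu_write(bpf_tail_call_cnt, 0);
	}

	/* conceptually runs at every tail-call site */
	static bool tail_call_allowed(void)
	{
		return this_cpu_inc_return(bpf_tail_call_cnt) <=
		       MAX_TAIL_CALL_CNT;
	}

	/*
	 * Interleaving on one CPU:
	 *   prog1 enters          -> cnt = 0
	 *   prog1 tail-calls 20x  -> cnt = 20
	 *   prog2 enters          -> cnt = 0  <- prog1's count is lost
	 *   prog2 exits
	 *   prog1 resumes         -> gets a fresh budget of 33 tail
	 *                            calls on top of the 20 it already
	 *                            used
	 */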
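
For reference, the recursion guard mentioned above: tracing entry
points bump a per-CPU bpf_prog_active counter and skip any nested prog
invocation. Roughly the shape of the check in trace_call_bpf()
(paraphrased from kernel/trace/bpf_trace.c, not a verbatim excerpt):

	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
		/* some bpf prog is already running on this CPU:
		 * don't call into another one (same or different)
		 */
		ret = 0;
		goto out;
	}
	/* ... run the prog ... */
 out:
	__this_cpu_dec(bpf_prog_active);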
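
And a sketch of the prog->aux variant: a per-prog, per-CPU counter, so
the count is keyed by (prog, CPU) instead of CPU alone. The field name
is hypothetical; the alloc/inc pattern mirrors how prog->active is
handled elsewhere in the kernel:

	struct bpf_prog_aux {
		/* ... existing fields ... */
		u32 __percpu *tail_call_cnt;	/* hypothetical field */
	};

	/* at prog load time */
	prog->aux->tail_call_cnt = alloc_percpu_gfp(u32, GFP_KERNEL);
	if (!prog->aux->tail_call_cnt)
		return -ENOMEM;

	/* at every tail-call site */
	static bool tail_call_allowed(struct bpf_prog *prog)
	{
		return this_cpu_inc_return(*prog->aux->tail_call_cnt) <=
		       MAX_TAIL_CALL_CNT;
	}

Since bpf_prog_active prevents a second tracing prog from entering
while one is already running on the same CPU, two runs of the same
prog cannot interleave there, so a per-prog per-CPU counter cannot be
clobbered the way the single global one can.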