On Fri, Sep 6, 2024 at 7:32 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>
>
>
> On 2024/9/5 17:13, Puranjay Mohan wrote:
> > Xu Kuohai <xukuohai@xxxxxxxxxxxxxxx> writes:
> >
> >> On 8/27/2024 10:23 AM, Leon Hwang wrote:
> >>>
> >>>
> >>> On 26/8/24 22:32, Xu Kuohai wrote:
> >>>> On 8/25/2024 9:09 PM, Leon Hwang wrote:
> >>>>> Like "bpf, x64: Fix tailcall infinite loop caused by freplace", the same
> >>>>> issue happens on arm64, too.
> >>>>>
> >>>
> >>> [...]
> >>>
> >>>>
> >>>> This patch makes arm64 jited prologue even more complex. I've posted a
> >>>> series [1]
> >>>> to simplify the arm64 jited prologue/epilogue. I think we can fix this
> >>>> issue based
> >>>> on [1]. I'll give it a try.
> >>>>
> >>>> [1]
> >>>> https://lore.kernel.org/bpf/20240826071624.350108-1-xukuohai@xxxxxxxxxxxxxxx/
> >>>>
> >>>
> >>> Your patch series seems great. We can fix it based on it.
> >>>
> >>> Please notify me if you have a successful try.
> >>>
> >>
> >> I think the complexity arises from having to decide whether
> >> to initialize or keep the tail counter value in the prologue.
> >>
> >> To get rid of this complexity, a straightforward idea is to
> >> move the tail call counter initialization to the entry of
> >> bpf world, and in the bpf world, we only increase and check
> >> the tail call counter, never save/restore or set it. The
> >> "entry of the bpf world" here refers to mechanisms like
> >> bpf_prog_run, bpf dispatcher, or bpf trampoline that
> >> allows bpf prog to be invoked from C function.
> >>
> >> Below is a rough POC diff for arm64 that could pass all
> >> of your tests. The tail call counter is held in callee-saved
> >> register x26, and is set to 0 by arch_run_bpf.
> >
> > I like this approach as it removes all the complexity of handling tcc in
>
> I like this approach, too.
>
> > different cases. Can we go ahead with this for arm64 and make
> > arch_run_bpf a weak function and let other architectures override this
> > if they want to use a similar approach to this and if other archs want to
> > do something else they can skip implementing arch_run_bpf.
> >
>
> Hi Alexei,
>
> What do you think about this idea?

This was discussed before and no, we're not going to add an extra tcc
init to bpf_prog_run and penalize everybody for this niche case.
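
For readers following the thread, the scheme Xu Kuohai describes in the quoted text (initialize the tail call counter once at the entry of the bpf world, then only increase and check it) can be modeled in plain userspace C. This is only a sketch of the idea, not the POC diff and not kernel code: model_run_bpf(), try_tail_call() and run_self_tail_calling_prog() are hypothetical names made up for illustration, and the model assumes the kernel's limit of MAX_TAIL_CALL_CNT = 33.

```c
#include <assert.h>

/* Assumed to match the kernel's tail call limit. */
#define MAX_TAIL_CALL_CNT 33

/*
 * Hypothetical model of a tail call inside the "bpf world":
 * the counter is only checked and increased, never saved,
 * restored, or reset. Returns 1 if the tail call is allowed.
 */
int try_tail_call(unsigned int *tcc)
{
	assert(tcc);
	if (*tcc >= MAX_TAIL_CALL_CNT)
		return 0;
	(*tcc)++;
	return 1;
}

/*
 * Hypothetical program that tail calls itself for as long as it
 * is allowed to. Returns how many program bodies were executed.
 */
unsigned int run_self_tail_calling_prog(unsigned int *tcc)
{
	unsigned int bodies = 1;	/* the initial invocation */

	while (try_tail_call(tcc))
		bodies++;
	return bodies;
}

/*
 * Hypothetical stand-in for the "entry of the bpf world": the
 * only place where the counter is set to 0. Everything reached
 * from here shares the same counter.
 */
unsigned int model_run_bpf(void)
{
	unsigned int tcc = 0;

	return run_self_tail_calling_prog(&tcc);
}
```

In this model every call to model_run_bpf() executes the initial program plus at most 33 tail-called bodies, because no code inside the bpf world can hand out a fresh counter. The cost of that single initialization on every entry is what the reply above objects to.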