On Tue, 10 Sep 2024 17:37:48 -0700 Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > On Tue, Sep 10, 2024 at 5:13 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote: > > > > On Tue, 10 Sep 2024 11:23:29 -0700 > > Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > + arm ML and maintainers > > > > > > On Wed, Sep 4, 2024 at 6:02 PM Andrii Nakryiko > > > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > > > > > Hey, > > > > > > > > I just recently realized that we are still missing multi-kprobe > > > > support for ARM64, which depends on CONFIG_FPROBE. And CONFIG_FPROBE > > > > seems to require CONFIG_HAVE_RETHOOK, which, it turns out, is not > > > > implemented for ARM64. > > > > > > > > It took me a while to realize what's going on, as I roughly remembered > > > > (and confirmed through lore search) that Masami's original rethook > > > > patches had arm64-specific bits. Long story short: > > > > > > > > 0f8f8030038a Revert "arm64: rethook: Add arm64 rethook implementation" > > > > 83acdce68949 arm64: rethook: Add arm64 rethook implementation > > > > > > > > The patch was landed and then reverted. I found some discussion online > > > > and it seems like the plan was to land arch-specific bits shortly > > > > after bpf-next PR. > > > > > > > > But it seems like that never happened. Why? > > > > > > > > I see s390x, RISC-V, loongarch (I'm not even mentioning x86-64) all > > > > have CONFIG_HAVE_RETHOOK, even powerpc is getting one (see [0]), it > > > > seems. How come ARM64 is the one left out? > > > > > > > > Can anyone please provide some context? And if that's just an > > > > oversight, can we prioritize landing this for ARM64 ASAP? > > > > > > > > [0] https://lore.kernel.org/bpf/20240830113131.7597-1-adubey@xxxxxxxxxxxxx/ > > > > > > > > > > Masami, Steven, > > > > > > Does Linus have to be in CC to get any reply here? Come on, it's been > > > almost a full week. > > > > Sorry about bothering you, let me check that. But I think we eventually > > You don't bother me, but I'd appreciate a bit more timely replies in > the future, if that's OK. > > > need my fprobe-on-fgraph patch which allows all architecture uses ftrace_regs > > instead of pt_regs for ftrace/fgraph users. That allows arm64 to implement > > fprobe. > > Ok, thanks for a bit more context. I understand the end goal with > fprobe-on-fgraph, but see below. > > > > > > > > > Maybe ARM64 folks have some context?... And hopefully desire to see > > > this through so that ARM64 doesn't stick out as a lesser-supported > > > platform as far as tracing goes compared to loongarch, s390x, and > > > powerpc (which just landed rethook support, see [2]). > > > > I think lesser-supported or not is not a matter, but they need to keep > > their architecutre healthy. Mark said that the current rethook > > implementation is not acceptable because arm64 can not manually generate > > I don't see Mark's reply in the link you sent. But did he refer to the > code in kprobes_trampoline.S or is it something different? Sorry, here it is: https://lkml.org/lkml/2022/4/12/2233 > > By lesser-supported I mean that a very important functionality (BPF > multi-kprobe, which relies on CONFIG_FPROBE and thus > {HAVE|CONFIG}_RETHOOK) is currently still missing. And whether x86-64 > support landed more than 2 years ago (end of March 2022), the second > practically most popular (and thus important for tools and such) ARM64 > platform still doesn't have this functionality. > > And that's limiting, BPF multi-kprobes are a huge improvement in > tooling usability. Sorry for inconvenient. But I think this transformation is really important. > So while I get the desire to have a clean and nice > end goal, and that it might take a bit longer to get everything right. > But, maybe, landing a stop-gap solution meanwhile (especially as > isolated and thus easily backportable as the patch [0] you referenced) > is an OK path forward? I had not realized that the PSTATE register was not saved correctly at that point. This is one reason why I decided to move in the current fprobe-on-fgraph direction. > > I'm just lacking full understanding on what exactly the issue is/was, > and that's why I'm asking all these questions. I'm not sure if [0] is > just broken for some subtle reason, or it is just suboptimal in some > sense (performance, code duplication, whatnot)? If [0] was not broken, I pushed it and the current pt_regs to ftrace_regs series is separated series. But it was broken. So I tried to find the correct way to fix it, and finally introduced the current fprobe on fgraph series. performance improvement is just a side effect. Thank you, > > > [0] https://lore.kernel.org/bpf/164338038439.2429999.17564843625400931820.stgit@devnote2/ > > > pt_regs. So we need to use ftrace_regs for that. > > So eventually, we need my fprobe series. > > > > https://lore.kernel.org/bpf/164338038439.2429999.17564843625400931820.stgit@devnote2/ > > > > Thank you, > > > > > > > > Note that there was already an implementation (see [1]), but for some > > > reason it never made it. > > > > > > [1] https://lore.kernel.org/bpf/164338038439.2429999.17564843625400931820.stgit@devnote2/ > > > [2] https://lore.kernel.org/bpf/172562357215.467568.2172858907419105155.b4-ty@xxxxxxxxxxxxxx/ > > > > > > > > > > > -- Andrii > > > > > > -- > > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx> -- Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>