On Thu, Aug 3, 2023 at 5:42 PM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote: > > On Wed, 2 Aug 2023 16:44:09 +0200 > Florent Revest <revest@xxxxxxxxxxxx> wrote: > > > On Wed, Aug 2, 2023 at 1:09 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > > > > > On Tue, 1 Aug 2023 15:18:56 -0700 > > > Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > On Tue, Aug 1, 2023 at 8:32 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > > > > > > > > > On Tue, 1 Aug 2023 11:20:36 -0400 > > > > > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > > > > > > > > > > The solution was to come up with ftrace_regs, which just means it has all > > > > > > the registers to extract the arguments of a function and nothing more. Most > > > > > > > > > > This isn't 100% true. The ftrace_regs may hold a fully filled pt_regs. As > > > > > the FTRACE_WITH_REGS callbacks still get passed a ftrace_regs pointer. They > > > > > will do: > > > > > > > > > > void callback(..., struct ftrace_regs *fregs) { > > > > > struct pt_regs *regs = ftrace_get_regs(fregs); > > > > > > > > > > > > > > > Where ftrace_get_regs() will return the pt_regs only if it is fully filled. > > > > > If it is not, then it returns NULL. This was what the x86 maintainers > > > > > agreed with. > > > > > > > > arch/arm64/include/asm/ftrace.h:#define arch_ftrace_get_regs(regs) NULL > > > > > > > > Ouch. That's very bad. > > > > We care a lot about bpf running well on arm64. > > > > > > [ Adding Mark and Florent ] > > > > Ah, thanks Steve! That's my favorite can of worms :) I actually > > consider sending a talk proposal to the tracing MC at LPC "pt_regs - > > the good the bad and the ugly" on this very topic because I care about > > unblocking BPF "multi_kprobe" (which really is fprobe) on arm64, maybe > > it would be interesting. > > Ah, it is almost same as my talk :) Oh, I didn't know! I submitted a proposal today but if the talks have a lot of overlap maybe it's best that only you give your talk, since you're the actual maintainer :) or we could co-present if there's something I could add but I think you have all the background anyway > > I pointed this out in > > https://lore.kernel.org/all/CABRcYm+esb8J2O1v6=C+h+HSa5NxraPUgo63w7-iZj0CXbpusg@xxxxxxxxxxxxxx/#t > > when Masami proposed adding calls from fprobe to perf. If every > > subsystem makes different assumptions about "how sparse" their pt_regs > > is and they call into one another, this could lead to... interesting > > bugs. (eg: currently, we don't populate a fake pstate in ftrace_regs. > > so we'd need to fake it when creating a sparse pt_regs _for Perf_, > > knowing that Perf specifically expects this reg to be set. this would > > require a struct copy anyway and some knowledge about how the data > > will be consumed, in an arch- and subsystem- specific way) > > yeah, sorry I missed that point. I should remove it until we can fix it. Uh, I shouldn't have buried my important comments so far down the email :/ I wasn't sure whether you had missed the paragraph. > > On the other hand, untangling all code paths that come from > > trampolines (with a light regs structure) from those that come from an > > exception (with a pt_regs) could lead to a lot of duplicated code, and > > converting between each subsystem's idea of a light regs structure > > (what if perf introduces a perf_regs now ?) would be tedious and slow > > (lots of copies ?). > > This is one discussion point I think. Actually, using pt_regs in kretprobe > (and rethook) is histrical accident. Originally, it had put a kprobe on > the function return trampoline to hook it. So keep the API compatiblity > I made the hand assembled code to save the pt_regs on the stack. > > My another question is if we have the fprobe to trace (hook) the function > return, why we still need the kretprobe itself. I think we can remove > kretprobe and use fprobe exit handler, because "function" probing will > be done by fprobe, not kprobe. And then, we can simplify the kprobe > interface and clarify what it is -- "kprobe is a wrapper of software > breakpoint". And we don't need to think about duplicated code anymore :) That sounds reasonable to me > As I said, I would like to phase out the kretprobe itself because it > provides the same feature of fprobe, which is confusing. jprobe was > removed a while ago, and now kretprobe is. But we can not phase out > it at once. So I think we will keep current kretprobe trampoline on > arm64 and just add new ftrace_regs based rethook. Then remove the > API next release. (after all users including systemtap is moved) Heads up to BPF folks though since they also have BPF "kretprobe" program types which would break in a similar fashion as multi_kprobe (even though BPF kretprobe programs have also been discouraged for a while in favor of BPF fexit programs) > > > The reason I started the FTRACE_WITH_ARGS (which gave us ftrace_regs) in > > > the first place, was because of the overhead you reported to me with > > > ftrace_regs_caller and why you wanted to go the direct trampoline approach. > > > That's when I realized I could use a subset because those registers were > > > already being saved. The only reason FTRACE_WITH_REGS was created was it > > > had to supply full pt_regs (including flags) and emulate a breakpoint for > > > the kprobes interface. But in reality, nothing really needs all that. > > > > > > > It's not about access to args. > > > > pt_regs is passed from bpf prog further into all kinds of perf event > > > > functions including stack walking. > > > > If all accesses are done in BPF bytecode, we could (theoretically) > > have the verifier and JIT work together to deny accesses to > > unpopulated fields, or relocate pt_regs accesses to ftrace_regs > > accesses to keep backward compatibility with existing multi_kprobe BPF > > programs. > > Yeah, that is what I would like to suggest, and what my patch does. > (let me update rethook too, it'll be a bit tricky since I don't want > break anything) I agree with Alexei that this is an unnecessary amount of complexity in the verifier just to avoid a struct copy though. It's good to know that we _could_ do it if we really need to someday but then again, if a user chooses an interface that gets a pt_regs they shouldn't expect high performance. Therefore, I think it's ok for BPF multi_kprobe to copy fields from a ftrace_regs to a pt_regs on stack, especially if it avoids so much additional complexity in the verifier.