Re: [PATCH v4 3/9] bpf/btf: Add a function to search a member of a struct/union

Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx> · Tue, 8 Aug 2023 23:32:23 +0900

On Mon, 7 Aug 2023 22:48:29 +0200
Jiri Olsa <olsajiri@xxxxxxxxx> wrote:

> On Fri, Aug 04, 2023 at 12:42:06AM +0900, Masami Hiramatsu wrote:
> 
> SNIP
> 
> > > 
> > > On the other hand, untangling all code paths that come from
> > > trampolines (with a light regs structure) from those that come from an
> > > exception (with a pt_regs) could lead to a lot of duplicated code, and
> > > converting between each subsystem's idea of a light regs structure
> > > (what if perf introduces a perf_regs now ?) would be tedious and slow
> > > (lots of copies ?).
> > 
> > This is one discussion point I think. Actually, using pt_regs in kretprobe
> > (and rethook) is histrical accident. Originally, it had put a kprobe on
> > the function return trampoline to hook it. So keep the API compatiblity
> > I made the hand assembled code to save the pt_regs on the stack.
> > 
> > My another question is if we have the fprobe to trace (hook) the function
> > return, why we still need the kretprobe itself. I think we can remove
> > kretprobe and use fprobe exit handler, because "function" probing will
> > be done by fprobe, not kprobe. And then, we can simplify the kprobe
> > interface and clarify what it is -- "kprobe is a wrapper of software
> > breakpoint". And we don't need to think about duplicated code anymore :)
> 
> 1+ sounds like good idea

Thanks! the downside will be that it requires to enable CONFIG_FPROBE
instead of CONFIG_KPROBES, but I think it is natural that the user, who 
wants to trace function boundary, enables CONFIG_FUNCTION_TRACER.

> > > > Otherwise, ftrace_regs() has support on arm64 for getting to the argument
> > > > registers and the stack. Even live kernel patching now uses ftrace_regs().
> > > >
> > > > >
> > > > > If you guys decide to convert fprobe to ftrace_regs please
> > > > > make it depend on kconfig or something.
> > > > > bpf side needs full pt_regs.
> > > 
> > > Some wild ideas that I brought up once in a BPF office hour: BPF
> > > "multi_kprobe" could provide a fake pt_regs (either by constructing a
> > > sparse one on the stack or by JIT-ing different offset accesses and/or
> > > by having the verifier deny access to unpopulated fields) or break the
> > > current API (is it conceivable to phase out BPF "multi_kprobe"
> > > programs in favor of BPF "fprobe" programs that don't lie about their
> > > API and guarantees and just provide a ftrace_regs ?)
> > 
> > +1 :)
> 
> so multi_kprobe link was created to allow fast attach of BPF kprobe-type
> programs to multiple functions.. I don't think there's need for new fprobe
> program

Ah, OK. So the focus point is shortening registration time.

> 
> > 
> > > 
> > > > Then use kprobes. When I asked Masami what the difference between fprobes
> > > > and kprobes was, he told me that it would be that it would no longer rely
> > > > on the slower FTRACE_WITH_REGS. But currently, it still does.
> > > 
> > > Actually... Moving fprobe to ftrace_regs should get even more spicy!
> > > :) Fprobe also wraps "rethook" which is basically the same thing as
> > > kretprobe: a return trampoline that saves a pt_regs, to the point that
> > > on x86 kretprobe's trampoline got dropped in favor of rethook's
> > > trampoline. But for the same reasons that we don't want ftrace to save
> > > pt_regs on arm64, rethook should probably also just save a ftrace_regs
> > > ? (also, to keep the fprobe callback signatures consistent between
> > > pre- and post- handlers). But if we want fprobe "post" callbacks to
> > > save a ftrace_regs now, either we need to re-introduce the kretprobe
> > > trampoline or also change the API of kretprobe (and break its symmetry
> > > with kprobe and we'd have the same problem all over again with BPF
> > > kretprobe program types...). All of this is "beautifully" entangled...
> > > :)
> > 
> > As I said, I would like to phase out the kretprobe itself because it
> > provides the same feature of fprobe, which is confusing. jprobe was
> > removed a while ago, and now kretprobe is. But we can not phase out
> > it at once. So I think we will keep current kretprobe trampoline on
> > arm64 and just add new ftrace_regs based rethook. Then remove the
> > API next release. (after all users including systemtap is moved) 
> > 
> > > 
> > > > The reason I started the FTRACE_WITH_ARGS (which gave us ftrace_regs) in
> > > > the first place, was because of the overhead you reported to me with
> > > > ftrace_regs_caller and why you wanted to go the direct trampoline approach.
> > > > That's when I realized I could use a subset because those registers were
> > > > already being saved. The only reason FTRACE_WITH_REGS was created was it
> > > > had to supply full pt_regs (including flags) and emulate a breakpoint for
> > > > the kprobes interface. But in reality, nothing really needs all that.
> > > >
> > > > > It's not about access to args.
> > > > > pt_regs is passed from bpf prog further into all kinds of perf event
> > > > > functions including stack walking.
> > > 
> > > If all accesses are done in BPF bytecode, we could (theoretically)
> > > have the verifier and JIT work together to deny accesses to
> > > unpopulated fields, or relocate pt_regs accesses to ftrace_regs
> > > accesses to keep backward compatibility with existing multi_kprobe BPF
> > > programs.
> > 
> > Yeah, that is what I would like to suggest, and what my patch does.
> > (let me update rethook too, it'll be a bit tricky since I don't want
> > break anything) 
> > 
> > Thanks,
> > 
> > > 
> > > Is there a risk that a "multi_kprobe" program could call into a BPF
> > > helper or kfunc that reads this pt_regs pointer and expect certain
> > > fields to be set ? I suppose we could also deny giving that "pt_regs"
> > > pointer to a helper... :/
> 
> I think Alexei answered this earlier in the thread:
> 
>  >From bpf side we don't care that such pt_regs is 100% filled in or
>  >only partial as long as this pt_regs pointer is valid for perf_event_output
>  >and stack walking that consume pt_regs.
>  >I believe that was and still is the case for both x86 and arm64.

OK, so I've made the ftrace_partial_regs() according to the idea now.

Thanks,

> 
> 
> jirka

-- 
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>