On Thu, 27 Apr 2023 13:58:56 +0200 Jiri Olsa <olsajiri@xxxxxxxxx> wrote: > On Thu, Apr 27, 2023 at 10:17:45AM +0900, Masami Hiramatsu (Google) wrote: > > Hi, > > > > Here is the 7th version of improve fprobe and add a basic fprobe event > > support for ftrace (tracefs) and perf. Here is the previous version. > > > > https://lore.kernel.org/all/168234755610.2210510.12133559313738141202.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ > > > > This version is rebased on the latest linux-trace/for-next, fixes > > bpf_get_btf_vmlinux() return value check [6/11] and adds new BTF $retval > > type support [9/11] (I forgot to implement this feature last time). > > Also updates according to the BTF $retval type support. > > > > You can also get this series from: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext > > > > With this fprobe events, we can continue to trace function entry/exit > > even if the CONFIG_KPROBES_ON_FTRACE is not available. Since > > CONFIG_KPROBES_ON_FTRACE requires the CONFIG_DYNAMIC_FTRACE_WITH_REGS, > > it is not available if the architecture only supports > > CONFIG_DYNAMIC_FTRACE_WITH_ARGS (e.g. arm64). And that means kprobe > > events can not probe function entry/exit effectively on such architecture. > > But this problem can be solved if the dynamic events supports fprobe events > > because fprobe events doesn't use kprobe but ftrace via fprobe. > > > > FPROBE EVENTS > > ============= > > > > Fprobe events allows user to add new events on the entry and exit of kernel > > functions (which can be ftraced). Unlike kprobe events, the fprobe events > > can only probe the function entry and exit, and it can only trace the > > function args, return value, and stacks. (no registers) > > For probing function body, users can continue to use the kprobe events. > > > > The tracepoint probe events (tprobe events) also allows user to add new > > events dynamically on the tracepoint. Most of the tracepoint already has > > trace-events, so this feature is useful if you only want to know a > > specific parameter, or trace the tracepoints which has no trace-events > > (e.g. sched_*_tp tracepoints only exposes the tracepoints.) > > > > The fprobe events syntax is; > > > > f[:[GRP/][EVENT]] FUNCTION [FETCHARGS] > > f[MAXACTIVE][:[GRP/][EVENT]] FUNCTION%return [FETCHARGS] > > > > And tracepoint probe events syntax is; > > > > t[:[GRP/][EVENT]] TRACEPOINT [FETCHARGS] > > > > This series includes BTF argument support for fprobe/tracepoint events, > > and kprobe events. This allows us to fetch a specific function parameter > > by name, and all parameters by '$$args'. > > are you planning to fetch and display more complicated types in future? > like strings or dereferencing struct field from argument pointer Yes, that's on my next TODO list. The string thing is a bit problematic, but I think in most case, "const char *" is used for the strings. > > > Note that enabling this feature, you need to enable CONFIG_BPF_SYSCALL and > > confirm that your arch supports CONFIG_HAVE_FUNCTION_ARG_ACCESS_API. > > > > E.g. > > > > # echo 't kfree ptr' >> dynamic_events > > # echo 'f kfree object' >> dynamic_events > > # cat dynamic_events > > t:tracepoints/kfree kfree ptr=ptr > > f:fprobes/kfree__entry kfree object=object > > # echo 1 > events/fprobes/enable > > # echo 1 > events/tracepoints/enable > > # echo > trace > > # head -n 20 trace | tail > > # TASK-PID CPU# ||||| TIMESTAMP FUNCTION > > # | | | ||||| | | > > tail-84 [000] ..... 1324.561958: kfree__entry: (kfree+0x4/0x140) object=0xffff888006383c00 > > tail-84 [000] ...1. 1324.561961: kfree: (__probestub_kfree+0x4/0x10) ptr=0xffff888006383c00 > > tail-84 [000] ..... 1324.561988: kfree__entry: (kfree+0x4/0x140) object=0x0 > > tail-84 [000] ...1. 1324.561988: kfree: (__probestub_kfree+0x4/0x10) ptr=0x0 > > tail-84 [000] ..... 1324.561989: kfree__entry: (kfree+0x4/0x140) object=0xffff88800671e600 > > tail-84 [000] ...1. 1324.561989: kfree: (__probestub_kfree+0x4/0x10) ptr=0xffff88800671e600 > > tail-84 [000] ..... 1324.562368: kfree__entry: (kfree+0x4/0x140) object=0xffff8880065e0580 > > tail-84 [000] ...1. 1324.562369: kfree: (__probestub_kfree+0x4/0x10) ptr=0xffff8880065e0580 > > I checked with perf and record/stat/script seem to work fine with this > > # ./perf record -e 'fprobes:myprobe' > ^C[ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.162 MB perf.data (1 samples) ] > > # ./perf script > systemd-oomd 479 [001] 14550.722079: fprobes:myprobe: (ffffffff81505be0) filename=0x557b033662b0 Thanks for testing! > > perf trace seems to be off with __probe_ip for some reason: > > # ./perf trace -e 'fprobes:myprobe' > 0.000 systemd-oomd/479 fprobes:myprobe(__probe_ip: -2125440032, filename: 93986839069680) > 1.189 systemd-oomd/479 fprobes:myprobe(__probe_ip: -2125440032, filename: 93986839070144) > > but it's probably perf issue Yeah, it seems that the perf trace handles __probe_ip as signed long. Does that happen with kprobe events too? Thank you, > > thanks, > jirka -- Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>