On Mon, Feb 7, 2022 at 1:58 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Mon, Feb 7, 2022 at 6:31 AM Hengqi Chen <hengqi.chen@xxxxxxxxx> wrote: > > > > Add syscall-specific variant of BPF_KPROBE named BPF_KPROBE_SYSCALL ([0]). > > The new macro hides the underlying way of getting syscall input arguments. > > With the new macro, the following code: > > > > SEC("kprobe/__x64_sys_close") > > int BPF_KPROBE(do_sys_close, struct pt_regs *regs) > > { > > int fd; > > > > fd = PT_REGS_PARM1_CORE(regs); > > /* do something with fd */ > > } > > > > can be written as: > > > > SEC("kprobe/__x64_sys_close") > > int BPF_KPROBE_SYSCALL(do_sys_close, int fd) > > { > > /* do something with fd */ > > } > > > > [0] Closes: https://github.com/libbpf/libbpf/issues/425 > > > > Signed-off-by: Hengqi Chen <hengqi.chen@xxxxxxxxx> > > --- > > tools/lib/bpf/bpf_tracing.h | 33 +++++++++++++++++++++++++++++++++ > > 1 file changed, 33 insertions(+) > > > > diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h > > index cf980e54d331..7ad9cdea99e1 100644 > > --- a/tools/lib/bpf/bpf_tracing.h > > +++ b/tools/lib/bpf/bpf_tracing.h > > @@ -461,4 +461,37 @@ typeof(name(0)) name(struct pt_regs *ctx) \ > > } \ > > static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args) > > > > +#define ___bpf_syscall_args0() ctx > > +#define ___bpf_syscall_args1(x) ___bpf_syscall_args0(), (void *)PT_REGS_PARM1_CORE_SYSCALL(regs) > > +#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (void *)PT_REGS_PARM2_CORE_SYSCALL(regs) > > +#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (void *)PT_REGS_PARM3_CORE_SYSCALL(regs) > > +#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (void *)PT_REGS_PARM4_CORE_SYSCALL(regs) > > +#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (void *)PT_REGS_PARM5_CORE_SYSCALL(regs) > > +#define ___bpf_syscall_args(args...) ___bpf_apply(___bpf_syscall_args, ___bpf_narg(args))(args) > > + > > +/* > > + * BPF_KPROBE_SYSCALL is a variant of BPF_KPROBE, which is intended for > > + * tracing syscall functions, like __x64_sys_close. It hides the underlying > > + * platform-specific low-level way of getting syscall input arguments from > > + * struct pt_regs, and provides a familiar typed and named function arguments > > + * syntax and semantics of accessing syscall input parameters. > > + * > > + * Original struct pt_regs* context is preserved as 'ctx' argument. This might > > + * be necessary when using BPF helpers like bpf_perf_event_output(). > > + */ > > LGTM. Please also mention that this macro relies on CO-RE so that > users are aware. > Now that Ilya's fixes are in again, added a small note about reliance on BPF CO-RE and pushed to bpf-next, thanks. On a relevant note. The whole __x64_sys_close vs sys_close depending on architecture and kernel version was always super annoying. BCC makes this transparent to users (AFAIK) and it always bothered me a little, but I didn't see a clean solution that fits libbpf. I think I finally found it, though. Instead of guessing whether the kprobe function is a syscall or not based on "sys_" prefix of a kernel function, we can use libbpf SEC() handling to do this transparently. What if we define two new SEC() definitions: SEC("ksyscall/write") and SEC("kretsyscall/write") (or maybe SEC("kprobe.syscall/write") and SEC("kretprobe.syscall/write"), not sure which one is better, voice your opinion, please). And for such special kprobes, libbpf will perform feature detection of this ARCH_SYSCALL_WRAPPER (we'll need to see the best way to do this in a simple and fast way, preferably without parsing kallsyms) and depending on it substitute either sys_write (or should it be __se_sys_write, according to Naveen) or __<arch>_sys_write. You get the idea. I like that this is still explicit and in the spirit of libbpf, but offloads the burden of knowing these intricate differences from users. Thoughts? > Unfortunately we had to back out Ilya's patches with > PT_REGS_SYSCALL_REGS() and PT_REGS_PARMx_CORE_SYSCALL(), so we'll need > to wait a bit before merging this. > > > > +#define BPF_KPROBE_SYSCALL(name, args...) \ > > +name(struct pt_regs *ctx); \ > > +static __attribute__((always_inline)) typeof(name(0)) \ > > +____##name(struct pt_regs *ctx, ##args); \ > > +typeof(name(0)) name(struct pt_regs *ctx) \ > > +{ \ > > + struct pt_regs *regs = PT_REGS_SYSCALL_REGS(ctx); \ > > + _Pragma("GCC diagnostic push") \ > > + _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \ > > + return ____##name(___bpf_syscall_args(args)); \ > > + _Pragma("GCC diagnostic pop") \ > > +} \ > > +static __attribute__((always_inline)) typeof(name(0)) \ > > +____##name(struct pt_regs *ctx, ##args) > > + > > #endif > > -- > > 2.30.2