>> Looks like macros only available for x86_64. Can we make it also >> available for other architectures so we won't introduce arch specific >> codes into bpf program? > >Yeah, but instead of copy/pasting it for each architecture, let's define PT_REGS_PARM4/PT_REGS_PARM4_CORE for x86-64 (is this the only arch with such inconsistency?) and then after all the architectures defined their macro define Currently, I think x86_64 is the only arch which causes this issue. But I'll add #ifndef to not only the 4th parameter but also all parameters for future extensibility. >> >> Also, could you add a selftest to use this macro, esp. for parameter 4? Sure. I'll add the same macro to bpf_tracing.h in selftests. Thanks! -----Original Message----- From: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> Sent: Wednesday, December 22, 2021 7:58 AM To: Yonghong Song <yhs@xxxxxx> Cc: Tada, Kenta (SGC) <Kenta.Tada@xxxxxxxx>; Andrii Nakryiko <andrii@xxxxxxxxxx>; bpf <bpf@xxxxxxxxxxxxxxx>; Alexei Starovoitov <ast@xxxxxxxxxx>; Daniel Borkmann <daniel@xxxxxxxxxxxxx>; Martin Lau <kafai@xxxxxx>; Song Liu <songliubraving@xxxxxx>; john fastabend <john.fastabend@xxxxxxxxx>; KP Singh <kpsingh@xxxxxxxxxx> Subject: Re: [PATCH] libbpf: Fix the incorrect register read for syscalls on x86_64 On Tue, Dec 21, 2021 at 7:51 AM Yonghong Song <yhs@xxxxxx> wrote: > > > > On 12/21/21 3:21 AM, Kenta.Tada@xxxxxxxx wrote: > > Currently, rcx is read as the fourth parameter of syscall on x86_64. > > But x86_64 Linux System Call convention uses r10 actually. > > This commit adds the wrapper for users who want to access to syscall > > params to analyze the user space. > > > > Signed-off-by: Kenta Tada <Kenta.Tada@xxxxxxxx> > > --- > > tools/lib/bpf/bpf_tracing.h | 20 ++++++++++++++++++++ > > 1 file changed, 20 insertions(+) > > > > diff --git a/tools/lib/bpf/bpf_tracing.h > > b/tools/lib/bpf/bpf_tracing.h index db05a5937105..f6fcccd9b10c > > 100644 > > --- a/tools/lib/bpf/bpf_tracing.h > > +++ b/tools/lib/bpf/bpf_tracing.h > > @@ -67,10 +67,15 @@ > > #if defined(__KERNEL__) || defined(__VMLINUX_H__) > > > > #define PT_REGS_PARM1(x) ((x)->di) > > +#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1(x) > > #define PT_REGS_PARM2(x) ((x)->si) > > +#define PT_REGS_PARM2_SYSCALL(x) PT_REGS_PARM2(x) > > #define PT_REGS_PARM3(x) ((x)->dx) > > +#define PT_REGS_PARM3_SYSCALL(x) PT_REGS_PARM3(x) > > #define PT_REGS_PARM4(x) ((x)->cx) > > +#define PT_REGS_PARM4_SYSCALL(x) ((x)->r10) /* syscall uses r10 */ > > I think this is correct. We have a bcc commit doing similar thing. > https://urldefense.com/v3/__https://github.com/iovisor/bcc/commit/c234 > 48e34ecd3cc9bfc19f0b43f4325f77c2e4cc*diff-c78ffb58f59e85eaba9bf9977b72 > 02f3e50f17e2a9ee556c36a311f9a9ab5d6e__;Iw!!JmoZiZGBv3RvKRSx!qa2pOQlUSd > WfoxmZ7EuPvZpWAlK5IBVP1eC-2d3njBLe3yehAXyV_0wI9mGRF5Q$ [github[.]com] > > > #define PT_REGS_PARM5(x) ((x)->r8) > > +#define PT_REGS_PARM5_SYSCALL(x) PT_REGS_PARM5(x) > > #define PT_REGS_RET(x) ((x)->sp) > > #define PT_REGS_FP(x) ((x)->bp) > > #define PT_REGS_RC(x) ((x)->ax) > > @@ -78,10 +83,15 @@ > > #define PT_REGS_IP(x) ((x)->ip) > > > > #define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), di) > > +#define PT_REGS_PARM1_CORE_SYSCALL(x) PT_REGS_PARM1_CORE(x) > > #define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), si) > > +#define PT_REGS_PARM2_CORE_SYSCALL(x) PT_REGS_PARM2_CORE(x) > > #define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), dx) > > +#define PT_REGS_PARM3_CORE_SYSCALL(x) PT_REGS_PARM3_CORE(x) > > #define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), cx) > > +#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ((x), r10) /* > > +syscall uses r10 */ > > #define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8) > > +#define PT_REGS_PARM5_CORE_SYSCALL(x) PT_REGS_PARM5_CORE(x) > > #define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), sp) > > #define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), bp) > > #define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), ax) @@ -117,10 > > +127,15 @@ > > #else > > > > #define PT_REGS_PARM1(x) ((x)->rdi) > > +#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1(x) > > #define PT_REGS_PARM2(x) ((x)->rsi) > > +#define PT_REGS_PARM2_SYSCALL(x) PT_REGS_PARM2(x) > > #define PT_REGS_PARM3(x) ((x)->rdx) > > +#define PT_REGS_PARM3_SYSCALL(x) PT_REGS_PARM3(x) > > #define PT_REGS_PARM4(x) ((x)->rcx) > > +#define PT_REGS_PARM4_SYSCALL(x) ((x)->r10) /* syscall uses r10 */ > > #define PT_REGS_PARM5(x) ((x)->r8) > > +#define PT_REGS_PARM5(x) PT_REGS_PARM5(x) > > #define PT_REGS_RET(x) ((x)->rsp) > > #define PT_REGS_FP(x) ((x)->rbp) > > #define PT_REGS_RC(x) ((x)->rax) > > @@ -128,10 +143,15 @@ > > #define PT_REGS_IP(x) ((x)->rip) > > > > #define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), rdi) > > +#define PT_REGS_PARM1_CORE_SYSCALL(x) PT_REGS_PARM1_CORE(x) > > #define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), rsi) > > +#define PT_REGS_PARM2_CORE_SYSCALL(x) PT_REGS_PARM2_CORE(x) > > #define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), rdx) > > +#define PT_REGS_PARM3_CORE_SYSCALL(x) PT_REGS_PARM3_CORE(x) > > #define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), rcx) > > +#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ((x), r10) /* > > +syscall uses r10 */ > > #define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8) > > +#define PT_REGS_PARM5_CORE_SYSCALL(x) PT_REGS_PARM5_CORE(x) > > #define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), rsp) > > #define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), rbp) > > #define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), rax) > > Looks like macros only available for x86_64. Can we make it also > available for other architectures so we won't introduce arch specific > codes into bpf program? Yeah, but instead of copy/pasting it for each architecture, let's define PT_REGS_PARM4/PT_REGS_PARM4_CORE for x86-64 (is this the only arch with such inconsistency?) and then after all the architectures defined their macro define #define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1(x) ... #ifndef PT_REGS_PARM4_SYSCALL(x) #define PT_REGS_PARM4_SYSCALL(x) PT_REGS_PARM4(x) #endif That way we'll avoid all the extra "no-op" definitions. > > Also, could you add a selftest to use this macro, esp. for parameter 4? +1