Andrii Nakryiko wrote:
On Fri, Feb 4, 2022 at 8:46 AM Naveen N. Rao
<naveen.n.rao@xxxxxxxxxxxxxxxxxx> wrote:
Ilya Leoshkevich wrote:
> Some architectures pass a pointer to struct pt_regs to syscall
> handlers, others unpack it into individual function parameters.
I think that is just dependent on ARCH_HAS_SYSCALL_WRAPPER, so only x86,
arm64 and s390 pass pointers to pt_regs to syscall entry points.
> Introduce a macro to describe what a particular arch does, using
> `passing pt_regs *` as a default.
>
> Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> ---
> tools/lib/bpf/bpf_tracing.h | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> index 30f0964f8c9e..08d2990c006f 100644
> --- a/tools/lib/bpf/bpf_tracing.h
> +++ b/tools/lib/bpf/bpf_tracing.h
> @@ -334,6 +334,15 @@ struct pt_regs;
>
> #endif /* defined(bpf_target_defined) */
>
> +/*
> + * When invoked from a syscall handler kprobe, returns a pointer to a
> + * struct pt_regs containing syscall arguments and suitable for passing to
> + * PT_REGS_PARMn_SYSCALL() and PT_REGS_PARMn_CORE_SYSCALL().
> + */
> +#ifndef PT_REGS_SYSCALL_REGS
> +#define PT_REGS_SYSCALL_REGS(ctx) ((struct pt_regs *)PT_REGS_PARM1(ctx))
> +#endif
> +
I think that name is misleading if an architecture doesn't implement syscall
wrappers, since you are simply getting access to the kprobe pt_regs, rather
than the syscall pt_regs. This can perhaps be named PT_REGS_SYSCALL_UNWRAP() or
such to make that clear.
UNWRAP implies that there is something to unwrap, always. In case of
s390x, for example, there is nothing to unwrap. So I think
PT_REGS_SYSCALL_REGS() makes more sense, it just fetches correct
pt_regs to work with to get syscall input arguments (and it might be
exactly the same pt_regs that are passed in).
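To make the two cases concrete, here is a minimal userspace sketch, not the libbpf code itself. The two-field struct pt_regs, the MOCK_ARCH_NO_WRAPPER switch, and the syscall_parm2()/demo_read_parm2() helpers are all hypothetical stand-ins. Only the default PT_REGS_SYSCALL_REGS() definition is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pt_regs: one slot per
 * argument register, just enough to model the two layouts. */
struct pt_regs {
	uintptr_t parm1;
	uintptr_t parm2;
};

#define PT_REGS_PARM1(x) ((x)->parm1)
#define PT_REGS_PARM2(x) ((x)->parm2)

/* Assumption: an arch without syscall wrappers overrides the default so
 * that the kprobe pt_regs are returned unchanged. */
#ifdef MOCK_ARCH_NO_WRAPPER
#define PT_REGS_SYSCALL_REGS(ctx) ((struct pt_regs *)(ctx))
#endif

/* Default from the patch: the handler's first argument points to the
 * pt_regs that hold the syscall arguments. */
#ifndef PT_REGS_SYSCALL_REGS
#define PT_REGS_SYSCALL_REGS(ctx) ((struct pt_regs *)PT_REGS_PARM1(ctx))
#endif

/* Second syscall argument as seen from a kprobe on the syscall handler. */
static uintptr_t syscall_parm2(struct pt_regs *ctx)
{
	struct pt_regs *regs = PT_REGS_SYSCALL_REGS(ctx);

	assert(regs);
	return PT_REGS_PARM2(regs);
}

/* Build the context a kprobe would see and read the argument back. */
static uintptr_t demo_read_parm2(uintptr_t value)
{
	struct pt_regs syscall_regs = { .parm1 = 0, .parm2 = value };
#ifdef MOCK_ARCH_NO_WRAPPER
	/* No wrapper: the kprobe context already is the syscall pt_regs. */
	return syscall_parm2(&syscall_regs);
#else
	/* Wrapper: the kprobe context carries a pointer to the syscall pt_regs. */
	struct pt_regs kprobe_regs = { .parm1 = (uintptr_t)&syscall_regs, .parm2 = 0 };

	return syscall_parm2(&kprobe_regs);
#endif
}
```

Either way the caller writes the same expression, which is the reason for preferring a name that does not promise an unwrap step.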
I think in practice most users won't ever have to use this, as we'll
add BPF_KPROBE_SYSCALL() macro, similar to BPF_KPROBE that we have
now, but specific to syscall kprobe.
That will be very nice.
Also, should this just be keyed off a simpler HAS_SYSCALL_WRAPPER or such,
rather than the other way around?
I think the way Ilya did it is totally fine.
diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 032ba809f3e57a..c72f285578d3fc 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -110,6 +110,8 @@
#endif /* __i386__ */
+#define HAS_SYSCALL_WRAPPER
+
#endif /* __KERNEL__ || __VMLINUX_H__ */
#elif defined(bpf_target_s390)
@@ -126,6 +128,7 @@
#define __PT_RC_REG gprs[2]
#define __PT_SP_REG gprs[15]
#define __PT_IP_REG psw.addr
+#define HAS_SYSCALL_WRAPPER
#elif defined(bpf_target_arm)
@@ -154,6 +157,7 @@
#define __PT_RC_REG regs[0]
#define __PT_SP_REG sp
#define __PT_IP_REG pc
+#define HAS_SYSCALL_WRAPPER
#elif defined(bpf_target_mips)
We can then simply do:
#ifdef HAS_SYSCALL_WRAPPER
#define PT_REGS_SYSCALL_UNWRAP(ctx) ((struct pt_regs *)PT_REGS_PARM1(ctx))
#else
#define PT_REGS_SYSCALL_UNWRAP(ctx) ((struct pt_regs *)(ctx))
#endif
Taking this a bit further, it would be nice if we could fold progs/bpf_misc.h
into bpf_tracing.h by also including SYS_PREFIX.
As far as I know, SYS_PREFIX depends not just on architecture but also
on kernel version (older versions of x86-64 kernels didn't need that
prefix). For selftests, given they follow the latest version of kernel
it's ok to always append SYS_PREFIX, but generally speaking for user
BPF apps, they would need to be more careful and check whether they
need SYS_PREFIX or not. So I don't want to add SYS_PREFIX to
bpf_tracing.h because it's misleading.
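As a rough illustration of the "check whether they need SYS_PREFIX" step, a user-space loader could try the prefixed symbol first and fall back to the bare one. This is only a sketch: pick_syscall_symbol(), the sym_exists_fn callback, and the two mock predicates are hypothetical names, and how a real tool decides that a symbol exists (for example by reading /proc/kallsyms) is not shown.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the running kernel has `sym`. */
typedef int (*sym_exists_fn)(const char *sym);

/* Writes the symbol to attach to into buf. Tries "<prefix>sys_<name>"
 * first, then plain "sys_<name>". Returns 0 on success, -1 if neither
 * candidate exists or buf is too small. */
static int pick_syscall_symbol(const char *prefix, const char *name,
			       sym_exists_fn exists, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%ssys_%s", prefix, name);

	if (n > 0 && (size_t)n < len && exists(buf))
		return 0;

	n = snprintf(buf, len, "sys_%s", name);
	if (n > 0 && (size_t)n < len && exists(buf))
		return 0;

	return -1;
}

/* Mock predicates standing in for a real symbol lookup. */
static int mock_new_kernel(const char *sym)
{
	return strcmp(sym, "__x64_sys_nanosleep") == 0;
}

static int mock_old_kernel(const char *sym)
{
	return strcmp(sym, "sys_nanosleep") == 0;
}
```

With "__x64_" as the prefix, the first mock yields "__x64_sys_nanosleep" and the second yields "sys_nanosleep", which is the version dependence that a fixed SYS_PREFIX in bpf_tracing.h would hide.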
That makes sense, thanks.
- Naveen