On Mon, 2023-01-16 at 23:37 +0100, Ilya Leoshkevich wrote: > On Mon, 2023-01-16 at 23:09 +0100, Ilya Leoshkevich wrote: > > On Fri, 2023-01-13 at 00:33 -0800, Andrii Nakryiko wrote: > > > This patch set fixes and extends libbpf's bpf_tracing.h support > > > for > > > tracing > > > arguments of kprobes/uprobes, and syscall as a special case. > > > > > > Depending on the architecture, anywhere between 3 and 8 arguments > > > can > > > be > > > passed to a function in registers (so relevant to kprobes and > > > uprobes), but > > > before this patch set libbpf's macros in bpf_tracing.h only > > > supported > > > up to > > > 5 arguments, which is limiting in practice. This patch set > > > extends > > > bpf_tracing.h to support up to 8 arguments, if architecture > > > allows. > > > This > > > includes explicit PT_REGS_PARMx() macro family, as well as > > > BPF_KPROBE() macro. > > > > > > Now, with tracing syscall arguments situation is sometimes quite > > > different. > > > For a lot of architectures syscall argument passing through > > > registers > > > differs > > > from function call sequence at least a little. For i386 it > > > differs > > > *a > > > lot*. > > > This patch set addresses this issue across all currently > > > supported > > > architectures and hopefully fixes existing issues. syscall(2) > > > manpage > > > defines > > > that either 6 or 7 arguments can be supported, depending on > > > architecture, so > > > libbpf defines 6 or 7 registers per architecture to be used to > > > fetch > > > syscall > > > arguments. > > > > > > Also, BPF_UPROBE and BPF_URETPROBE are introduced as part of this > > > patch set. > > > They are aliases for BPF_KPROBE and BPF_KRETPROBE (as mechanics > > > of > > > argument > > > fetching of kernel functions and user-space functions are > > > identical), > > > but it > > > allows BPF users to have less confusing BPF-side code when > > > working > > > with > > > uprobes. > > > > > > For both sets of changes selftests are extended to test these new > > > register > > > definitions to architecture-defined limits. Unfortunately I don't > > > have ability > > > to test it on all architectures, and BPF CI only tests 3 > > > architecture > > > (x86-64, > > > arm64, and s390x), so it would be greatly appreciated if CC'ed > > > people > > > can help > > > review and test changes on architectures they are familiar with > > > (and > > > maybe > > > have direct access to for testing). Thank you. > > > > > > Cc: Alan Maguire <alan.maguire@xxxxxxxxxx> > > > Cc: Ilya Leoshkevich <iii@xxxxxxxxxxxxx> > > > Cc: Pu Lehui <pulehui@xxxxxxxxxx> > > > Cc: Hengqi Chen <hengqi.chen@xxxxxxxxx> > > > Cc: Vladimir Isaev <isaev@xxxxxxxxxxxx> > > > Cc: Björn Töpel <bjorn@xxxxxxxxxx> > > > Cc: Kenta Tada <Kenta.Tada@xxxxxxxx> > > > Cc: Florent Revest <revest@xxxxxxxxxxxx> > > > > > > Andrii Nakryiko (25): > > > libbpf: add support for fetching up to 8 arguments in kprobes > > > libbpf: add 6th argument support for x86-64 in bpf_tracing.h > > > libbpf: fix arm and arm64 specs in bpf_tracing.h > > > libbpf: complete mips spec in bpf_tracing.h > > > libbpf: complete powerpc spec in bpf_tracing.h > > > libbpf: complete sparc spec in bpf_tracing.h > > > libbpf: complete riscv arch spec in bpf_tracing.h > > > libbpf: fix and complete ARC spec in bpf_tracing.h > > > libbpf: complete LoongArch (loongarch) spec in bpf_tracing.h > > > libbpf: add BPF_UPROBE and BPF_URETPROBE macro aliases > > > selftests/bpf: validate arch-specific argument registers limits > > > libbpf: improve syscall tracing support in bpf_tracing.h > > > libbpf: define x86-64 syscall regs spec in bpf_tracing.h > > > libbpf: define i386 syscall regs spec in bpf_tracing.h > > > libbpf: define s390x syscall regs spec in bpf_tracing.h > > > libbpf: define arm syscall regs spec in bpf_tracing.h > > > libbpf: define arm64 syscall regs spec in bpf_tracing.h > > > libbpf: define mips syscall regs spec in bpf_tracing.h > > > libbpf: define powerpc syscall regs spec in bpf_tracing.h > > > libbpf: define sparc syscall regs spec in bpf_tracing.h > > > libbpf: define riscv syscall regs spec in bpf_tracing.h > > > libbpf: define arc syscall regs spec in bpf_tracing.h > > > libbpf: define loongarch syscall regs spec in bpf_tracing.h > > > selftests/bpf: add 6-argument syscall tracing test > > > libbpf: clean up now not needed __PT_PARM{1-6}_SYSCALL_REG > > > defaults > > > > > > tools/lib/bpf/bpf_tracing.h | 301 > > > +++++++++++++++- > > > -- > > > .../bpf/prog_tests/test_bpf_syscall_macro.c | 18 +- > > > .../bpf/prog_tests/uprobe_autoattach.c | 33 +- > > > tools/testing/selftests/bpf/progs/bpf_misc.h | 25 ++ > > > .../selftests/bpf/progs/bpf_syscall_macro.c | 26 ++ > > > .../bpf/progs/test_uprobe_autoattach.c | 48 ++- > > > 6 files changed, 405 insertions(+), 46 deletions(-) > > > > > > > With the following fixup for 24/25: [...] > > Tested-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx> # s390x > > While the above fixup works, I realized that it's ugly. It's better > to > admit that mmap and old_mmap are different syscalls and create a > different probe, even if it means duplicating pid filtering code: [...] Sorry, I'm being dense. Both fixups defeat the purpose of having this test, because they don't use all 6 register arguments. We need to choose a different syscall; I believe splice() fits the bill. The other alternatives that I rejected were: - clone() - argument order is messy; - recvfrom() - s390x uses socketcall instead; - ipc() - doesn't seem to be available on aarch64. The following worked for me: diff --git a/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c b/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c index e18dd82eb801..2900c5e9a016 100644 --- a/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c +++ b/tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c @@ -1,20 +1,22 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright 2022 Sony Group Corporation */ +#define _GNU_SOURCE +#include <fcntl.h> #include <sys/prctl.h> -#include <sys/mman.h> #include <test_progs.h> #include "bpf_syscall_macro.skel.h" void test_bpf_syscall_macro(void) { struct bpf_syscall_macro *skel = NULL; - int err, page_size = getpagesize(); + int err; int exp_arg1 = 1001; unsigned long exp_arg2 = 12; unsigned long exp_arg3 = 13; unsigned long exp_arg4 = 14; unsigned long exp_arg5 = 15; - void *r; + loff_t off_in, off_out; + ssize_t r; /* check whether it can open program */ skel = bpf_syscall_macro__open(); @@ -71,18 +73,17 @@ void test_bpf_syscall_macro(void) ASSERT_EQ(skel->bss->arg4_syscall, exp_arg4, "BPF_KPROBE_SYSCALL_arg4"); ASSERT_EQ(skel->bss->arg5_syscall, exp_arg5, "BPF_KPROBE_SYSCALL_arg5"); - r = mmap((void *)0x12340000, 3 * page_size, PROT_READ | PROT_WRITE, - MAP_PRIVATE, -42, 5 * page_size); + r = splice(-42, &off_in, 42, &off_out, 0x12340000, SPLICE_F_NONBLOCK); err = -errno; - ASSERT_EQ(r, MAP_FAILED, "mmap_res"); - ASSERT_EQ(err, -EBADF, "mmap_err"); + ASSERT_EQ(r, -1, "splice_res"); + ASSERT_EQ(err, -EBADF, "splice_err"); - ASSERT_EQ(skel->bss->mmap_addr, 0x12340000, "mmap_arg1"); - ASSERT_EQ(skel->bss->mmap_length, 3 * page_size, "mmap_arg2"); - ASSERT_EQ(skel->bss->mmap_prot, PROT_READ | PROT_WRITE, "mmap_arg3"); - ASSERT_EQ(skel->bss->mmap_flags, MAP_PRIVATE, "mmap_arg4"); - ASSERT_EQ(skel->bss->mmap_fd, -42, "mmap_arg5"); - ASSERT_EQ(skel->bss->mmap_offset, 5 * page_size, "mmap_arg6"); + ASSERT_EQ(skel->bss->splice_fd_in, -42, "splice_arg1"); + ASSERT_EQ(skel->bss->splice_off_in, (__u64)&off_in, "splice_arg2"); + ASSERT_EQ(skel->bss->splice_fd_out, 42, "splice_arg3"); + ASSERT_EQ(skel->bss->splice_off_out, (__u64)&off_out, "splice_arg4"); + ASSERT_EQ(skel->bss->splice_len, 0x12340000, "splice_arg5"); + ASSERT_EQ(skel->bss->splice_flags, SPLICE_F_NONBLOCK, "splice_arg6"); cleanup: bpf_syscall_macro__destroy(skel); diff --git a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c index c07c5c52d5fc..1a476d8ed354 100644 --- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c +++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c @@ -81,28 +81,28 @@ int BPF_KSYSCALL(prctl_enter, int option, unsigned long arg2, return 0; } -__u64 mmap_addr; -__u64 mmap_length; -__u64 mmap_prot; -__u64 mmap_flags; -__u64 mmap_fd; -__u64 mmap_offset; - -SEC("ksyscall/mmap") -int BPF_KSYSCALL(mmap_enter, void *addr, size_t length, int prot, int flags, - int fd, off_t offset) +__u64 splice_fd_in; +__u64 splice_off_in; +__u64 splice_fd_out; +__u64 splice_off_out; +__u64 splice_len; +__u64 splice_flags; + +SEC("ksyscall/splice") +int BPF_KSYSCALL(splice_enter, int fd_in, loff_t *off_in, int fd_out, + loff_t *off_out, size_t len, unsigned int flags) { pid_t pid = bpf_get_current_pid_tgid() >> 32; if (pid != filter_pid) return 0; - mmap_addr = (__u64)addr; - mmap_length = length; - mmap_prot = prot; - mmap_flags = flags; - mmap_fd = fd; - mmap_offset = offset; + splice_fd_in = fd_in; + splice_off_in = (__u64)off_in; + splice_fd_out = fd_out; + splice_off_out = (__u64)off_out; + splice_len = len; + splice_flags = flags; return 0; } -- 2.39.0 Best regards, Ilya