On Thu, Apr 20, 2023 at 2:40 PM Yauheni Kaliuta <ykaliuta@xxxxxxxxxx> wrote: > > Hi, Alexei! > > >>>>> On Thu, 20 Apr 2023 13:54:26 -0700, Alexei Starovoitov wrote: > > On Thu, Apr 20, 2023 at 1:37 PM Yauheni Kaliuta <ykaliuta@xxxxxxxxxx> wrote: > >> >>>>> On Thu, 20 Apr 2023 08:59:09 -0700, Alexei Starovoitov wrote: > >> >> > >> >> Should perf_call_bpf_enter/exit (kernel/trace/trace_syscalls.c) > >> >> use struct trace_event_raw_sys_enter/exit instead of locally > >> >> crafted struct syscall_tp_t nowadays? > >> > >> > >> > No. It needs syscall_tp_t. > >> > >> > test_progs's vmlinux test > >> >> expects it as the context. > >> >> > >> > >> > what do you mean? Pls share a code pointer? > >> > >> https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_vmlinux.c#L19 > >> > >> SEC("tp/syscalls/sys_enter_nanosleep") > >> int handle__tp(struct trace_event_raw_sys_enter *args) > > > I see. That bit is correct and that's what bpftrace is doing > > when attaching to syscalls. > > What do you see in your patched RT kernel when you do: > > cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_nanosleep/format > > ? > > Depending on the answer we might need to fix > > the kernel side that has to use struct trace_entry > > in syscall_tp_t instead of plain long long. > > # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_nanosleep/format > name: sys_enter_nanosleep > ID: 374 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > field:unsigned char common_preempt_lazy_count; offset:8; size:1; signed:0; > > field:int __syscall_nr; offset:12; size:4; signed:1; > field:struct __kernel_timespec * rqtp; offset:16; size:8; signed:0; > field:struct __kernel_timespec * rmtp; offset:24; size:8; signed:0; > > print fmt: "rqtp: 0x%08lx, rmtp: 0x%08lx", ((unsigned long)(REC->rqtp)), ((unsigned long)(REC->rmtp)) Lol. Jiri even fixed the issue with this format in bpftrace 3 years ago: https://github.com/iovisor/bpftrace/commit/a2e3d5dbc03ceb49b776cf5602d31896158844a7 Let's fix the kernel side too. Something like this should do it: diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c index 942ddbdace4a..7aa1f4299486 100644 --- a/kernel/trace/trace_syscalls.c +++ b/kernel/trace/trace_syscalls.c @@ -555,7 +555,7 @@ static int perf_call_bpf_enter(struct trace_event_call *call, struct pt_regs *re struct syscall_trace_enter *rec) { struct syscall_tp_t { - unsigned long long regs; + struct trace_entry ent; unsigned long syscall_nr; unsigned long args[SYSCALL_DEFINE_MAXARGS]; } param; @@ -657,7 +657,7 @@ static int perf_call_bpf_exit(struct trace_event_call *call, struct pt_regs *reg struct syscall_trace_exit *rec) { struct syscall_tp_t { - unsigned long long regs; + struct trace_entry ent; pls add build_bug_on that sizeof(ent) >= sizeof(void*).