On Fri, Apr 21, 2023 at 4:17 AM Yauheni Kaliuta <ykaliuta@xxxxxxxxxx> wrote: > > Hi, Alexei! > > >>>>> On Thu, 20 Apr 2023 16:12:49 -0700, Alexei Starovoitov wrote: > > On Thu, Apr 20, 2023 at 2:40 PM Yauheni Kaliuta <ykaliuta@xxxxxxxxxx> wrote: > >> >>>>> On Thu, 20 Apr 2023 13:54:26 -0700, Alexei Starovoitov wrote: > >> > On Thu, Apr 20, 2023 at 1:37 PM Yauheni Kaliuta <ykaliuta@xxxxxxxxxx> wrote: > >> >> >>>>> On Thu, 20 Apr 2023 08:59:09 -0700, Alexei Starovoitov wrote: > >> >> >> > >> >> >> Should perf_call_bpf_enter/exit (kernel/trace/trace_syscalls.c) > >> >> >> use struct trace_event_raw_sys_enter/exit instead of locally > >> >> >> crafted struct syscall_tp_t nowadays? > >> >> > >> >> > >> >> > No. It needs syscall_tp_t. > >> >> > >> >> > test_progs's vmlinux test > >> >> >> expects it as the context. > >> >> >> > >> >> > >> >> > what do you mean? Pls share a code pointer? > >> >> > >> >> https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/progs/test_vmlinux.c#L19 > >> >> > >> >> SEC("tp/syscalls/sys_enter_nanosleep") > >> >> int handle__tp(struct trace_event_raw_sys_enter *args) > >> > >> > I see. That bit is correct and that's what bpftrace is doing > >> > when attaching to syscalls. > >> > What do you see in your patched RT kernel when you do: > >> > cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_nanosleep/format > >> > ? > >> > Depending on the answer we might need to fix > >> > the kernel side that has to use struct trace_entry > >> > in syscall_tp_t instead of plain long long. > >> > >> # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_nanosleep/format > >> name: sys_enter_nanosleep > >> ID: 374 > >> format: > >> field:unsigned short common_type; offset:0; size:2; signed:0; > >> field:unsigned char common_flags; offset:2; size:1; signed:0; > >> field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > >> field:int common_pid; offset:4; size:4; signed:1; > >> field:unsigned char common_preempt_lazy_count; offset:8; size:1; signed:0; > >> > >> field:int __syscall_nr; offset:12; size:4; signed:1; > >> field:struct __kernel_timespec * rqtp; offset:16; size:8; signed:0; > >> field:struct __kernel_timespec * rmtp; offset:24; size:8; signed:0; > >> > >> print fmt: "rqtp: 0x%08lx, rmtp: 0x%08lx", ((unsigned long)(REC->rqtp)), ((unsigned long)(REC->rmtp)) > > > > Lol. > > Jiri even fixed the issue with this format in bpftrace 3 years ago: > > https://github.com/iovisor/bpftrace/commit/a2e3d5dbc03ceb49b776cf5602d31896158844a7 > > Hehe :) > > > Let's fix the kernel side too. Something like this should do it: > > > diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c > > index 942ddbdace4a..7aa1f4299486 100644 > > --- a/kernel/trace/trace_syscalls.c > > +++ b/kernel/trace/trace_syscalls.c > > @@ -555,7 +555,7 @@ static int perf_call_bpf_enter(struct > > trace_event_call *call, struct pt_regs *re > > struct syscall_trace_enter *rec) > > { > > struct syscall_tp_t { > > - unsigned long long regs; > > + struct trace_entry ent; > > unsigned long syscall_nr; > > unsigned long args[SYSCALL_DEFINE_MAXARGS]; > > } param; > > @@ -657,7 +657,7 @@ static int perf_call_bpf_exit(struct > > trace_event_call *call, struct pt_regs *reg > > struct syscall_trace_exit *rec) > > { > > struct syscall_tp_t { > > - unsigned long long regs; > > + struct trace_entry ent; > > > > pls add build_bug_on that sizeof(ent) >= sizeof(void*). > > Ok. Should the line *(struct pt_regs **)¶m = regs; be commented somehow? commented out? No. It's mandatory. And the reason for build_bug_on existence... to make sure that there is enough space there.