On Mon, Aug 19, 2024 at 01:47:20PM +0200, Jiri Olsa wrote: > On Fri, Aug 16, 2024 at 03:30:40PM -0400, Steven Rostedt wrote: > > On Fri, 16 Aug 2024 20:59:47 +0200 > > Jiri Olsa <olsajiri@xxxxxxxxx> wrote: > > > > > so far the only working solution I have is adding '__nullable' suffix > > > to argument name: > > > > > > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h > > > index 9ea4c404bd4e..fc46f0b42741 100644 > > > --- a/include/trace/events/sched.h > > > +++ b/include/trace/events/sched.h > > > @@ -559,9 +559,9 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime, > > > */ > > > TRACE_EVENT(sched_pi_setprio, > > > > > > - TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task), > > > + TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task__nullable), > > > > > > - TP_ARGS(tsk, pi_task), > > > + TP_ARGS(tsk, pi_task__nullable), > > > > > > TP_STRUCT__entry( > > > __array( char, comm, TASK_COMM_LEN ) > > > @@ -574,8 +574,8 @@ TRACE_EVENT(sched_pi_setprio, > > > memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN); > > > __entry->pid = tsk->pid; > > > __entry->oldprio = tsk->prio; > > > - __entry->newprio = pi_task ? > > > - min(tsk->normal_prio, pi_task->prio) : > > > + __entry->newprio = pi_task__nullable ? > > > + min(tsk->normal_prio, pi_task__nullable->prio) : > > > tsk->normal_prio; > > > /* XXX SCHED_DEADLINE bits missing */ > > > ), > > > > > > > > > now I'm trying to make work something like: > > > > > > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h > > > index 9ea4c404bd4e..4e4aae2d5700 100644 > > > --- a/include/trace/events/sched.h > > > +++ b/include/trace/events/sched.h > > > @@ -559,9 +559,9 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime, > > > */ > > > TRACE_EVENT(sched_pi_setprio, > > > > > > - TP_PROTO(struct task_struct *tsk, struct task_struct *pi_task), > > > + TP_PROTO(struct task_struct *tsk, struct task_struct *__nullable(pi_task)), > > > > > > - TP_ARGS(tsk, pi_task), > > > + TP_ARGS(tsk, __nullable(pi_task)), > > > > > > TP_STRUCT__entry( > > > __array( char, comm, TASK_COMM_LEN ) > > > > Hmm, that's really ugly though. Both versions. > > > > Now when Alexei said: > > > > > > > > > We cannot make all tracepoint pointers to be PTR_TRUSTED | PTR_MAYBE_NULL > > > > > > > by default, since it will break a bunch of progs. > > > > > > > Instead we can annotate this tracepoint arg as __nullable and > > > > > > > teach the verifier to recognize such special arguments of tracepoints. > > > > I'm not familiar with the verifier, so I don't know how the above is > > implemented, and why it would break a bunch of progs. > > verifier assumes that programs attached to the tracepoint can access > pointer arguments without checking them for null and some of those > programs most likely access such arguments directly > > changing that globally and require bpf program to do null check for all > pointer arguments will make verifier fail to load existing programs > > > > > If you had a macro around the parameter: > > > > TP_PROTO(struct task_struct *tsk, struct task_struct *__nullable(pi_task)), > > > > Could having that go through another macro pass in trace_events.h work? > > That is, could we associate the trace event with "nullable" parameters > > that could be stored someplace else for you? > > IIUC you mean to store extra data for each tracepoint that would > annotate the argument? as Alexei pointed out earlier it might be > too much, because we'd be fine with just adding suffix to annotated > arguments in __bpf_trace_##call: > > __bpf_trace_##call(void *__data, proto) \ nah.. it's defined for class template, so tracepoints like cgroup_mkdir won't have its own __bpf_trace_cgroup_mkdir function that we could use we need to do something else jirka > { \ > CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(__data, CAST_TO_U64(args)); \ > } > > with that verifier could easily get suffix information from BTF and > once gcc implements btf_type_tag we can easily switch to that > > jirka