Re: [PATCH -trace] trace: fix TASK_COMM_LEN in trace event format file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 10, 2023 at 10:13 AM Kajetan Puchalski
<kajetan.puchalski@xxxxxxx> wrote:
>
> Hi Yafang,
>
> > After commit 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"),
> > the content of the format file under
> > /sys/kernel/debug/tracing/events/task/task_newtask was changed from
> >   field:char comm[16];    offset:12;    size:16;    signed:0;
> > to
> >   field:char comm[TASK_COMM_LEN];    offset:12;    size:16;    signed:0;
> >
> > John reported that this change breaks older versions of perfetto.
> > Then Mathieu pointed out that this behavioral change was caused by the
> > use of __stringify(_len), which happens to work on macros, but not on enum
> > labels. And he also gave the suggestion on how to fix it:
> >   :One possible solution to make this more robust would be to extend
> >   :struct trace_event_fields with one more field that indicates the length
> >   :of an array as an actual integer, without storing it in its stringified
> >   :form in the type, and do the formatting in f_show where it belongs.
> >
> > The result as follows after this change,
> > $ cat /sys/kernel/debug/tracing/events/task/task_newtask/format
> >         field:char comm[16];    offset:12;      size:16;        signed:0;
> >
> > Fixes: 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN")
> > Reported-by: John Stultz <jstultz@xxxxxxxxxx>
> > Debugged-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > Cc: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
> > Cc: Kajetan Puchalski <kajetan.puchalski@xxxxxxx>
> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Cc: John Stultz <jstultz@xxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx # v5.17+
>
> From my testing this works the same with older Perfetto as the previous
> diff you sent in the older thread, ie this type of errors:
>
> Name    Value   Type
> mismatched_sched_switch_tids    2439    error (analysis)
> systrace_parse_failure  3853    error (analysis)
>
> Meaning that applying this patch on top of the old one ends up with
> different results than just reverting the old one so not a 100% fix
> but as I said before, still an improvement.

fwiw I don't mind reverting commit 3087c61ed2c4.
For the record it didn't go through the bpf tree.
It went through mm.

But first we need to define which part of ftrace format is an abi.
I think it's a format of tracing/event/foo/format file
and not the contents of it.
The tracepoints come and go. Their arguments get changed too.
So the contents of ftrace format files change.

In this case Perfetto stumbled on parsing
field:char comm[TASK_COMM_LEN];    offset:8;

We probably should define that 'field: value  offset: value'
is an abi, but value-s can change.
Say offset:8 becomes offset:+8, for whatever reason.
If Perfetto fails to parse it, it's a Perfetto bug.

In this case it failed to parse char comm[TASK_COMM_LEN];
But that's not the only such "field:".
These three were there for a long time.
    field:u32 rates[NUM_NL80211_BANDS];    offset:16;    size:24;
    field:int mcast_rate[NUM_NL80211_BANDS];    offset:60;    size:24;
    field:int mcast_rate[NUM_NL80211_BANDS];    offset:108;    size:24;

I suspect Perfetto didn't have a use case to parse them,
so the bug stayed dormant and a change in TASK_COMM_LEN triggered it.

We can use Yafang's patch to do:
-field:%.*s %s%s;
+field:%.*s %s[%d];
but it will affect both TASK_COMM_LEN and NUM_NL80211_BANDS.

In summary the TASK_COMM_LEN change from #define to enum didn't
introduce anything new in the kind of "value"s being printed
in ftrace files. Hence I'm arguing there is no abi breakage.

There was a question why change from #define to enum is useful
to bpf. The reason is that #define-s are not seen in dwarf
whereas enums and their values are. bpf tooling has ways to extract
that data and auto-adjust bpf programs when enum-s disappear
or their values change.



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux