Re: [PATCH bpf-next v2] selftests/bpf: emit top frequent code lines in veristat

Mykyta Yatsenko <mykyta.yatsenko5@xxxxxxxxx> · Thu, 19 Sep 2024 13:08:46 +0100

On 19/09/2024 03:51, Philo Lu wrote:

Hi Mykyta,

On 2024/9/19 04:39, Mykyta Yatsenko wrote:
From: Mykyta Yatsenko <yatsenko@xxxxxxxx>

Production BPF programs are growing in number of instructions and states
to the point where optimising the verification process for them is
necessary to avoid running into the instruction limit. Authors of those
BPF programs need to analyze verifier output, for example by collecting
the most frequent source code lines, to understand which part of the
program has the biggest verification cost.

This patch introduces the `--top-src-lines` flag in veristat.
`--top-src-lines=N` makes veristat output the N most frequent source code
lines, parsed from the verification log.

An example:
```
$ sudo ./veristat --log-size=1000000000 --top-src-lines=4 pyperf600.bpf.o
Processing 'pyperf600.bpf.o'...
Top source lines (on_event):
  4697: (pyperf.h:0)
  2334: (pyperf.h:326)    event->stack[i] = *symbol_id;
  2334: (pyperf.h:118)    pidData->offsets.String_data);
  1176: (pyperf.h:92) bpf_probe_read_user(&frame->f_back,
...
```
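
For readers who want to reproduce this kind of count outside veristat, here
is a minimal Python sketch. It is not the patch's implementation. It assumes
the verifier log annotates instructions with lines of the form
`; <source text> @ <file>:<line>`; older kernels may print these annotations
differently. The name `top_src_lines` and the sample log are hypothetical.

```python
import re
from collections import Counter

# Assumed annotation shape in the verifier log:
#   "; <source text> @ <file>:<line>"
SRC_LINE = re.compile(
    r"^;\s*(?P<text>.*?)\s*@\s*(?P<file>[^\s:]+):(?P<line>\d+)\s*$"
)


def top_src_lines(log_text, n):
    """Return the n most frequent (file, line, text) annotations with counts."""
    counts = Counter()
    for raw in log_text.splitlines():
        m = SRC_LINE.match(raw)
        if m:
            counts[(m["file"], int(m["line"]), m["text"])] += 1
    return counts.most_common(n)


# Hypothetical log fragment, for illustration only.
sample_log = """\
; event->stack[i] = *symbol_id; @ pyperf.h:326
10: (bf) r1 = r2
; pidData->offsets.String_data); @ pyperf.h:118
11: (bf) r3 = r4
; event->stack[i] = *symbol_id; @ pyperf.h:326
12: (bf) r5 = r6
"""

for (fname, lineno, text), count in top_src_lines(sample_log, 2):
    print(f"{count:5}: ({fname}:{lineno})\t{text}")
```

The count is the number of times an annotation appears in the log, so a
source line that the verifier revisits in many states ranks higher than one
it sees once.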

I think this is useful and wonder how I can use it. In particular, is
it possible to know the number of instructions contributed by each
source line?

No, as far as I know, we don't have that info, so we just use the number
of source lines as a proxy for the number of instructions. Eduard
suggested collecting the instruction count per source line in the
verifier; maybe that is actually what we should do.

Assume a prog is rejected due to the instruction limit. I can optimize
the prog with `--top-src-lines`, but I have to check the result with
another "load" to see the total instruction number (because I don't know
how many instructions were saved by optimizing those source lines).

Am I right? Or is there any better method?

Yes, you are right.
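
To make that reload-and-compare step concrete, here is a rough Python
sketch. It assumes each verifier log ends with the usual summary line
`processed N insns (limit M) ...` and that the logs from before and after
the change were saved as text. The helper names and the sample logs are
hypothetical.

```python
import re

# Summary line the verifier prints at the end of a log, e.g.
#   "processed 1234 insns (limit 1000000) max_states_per_insn 4 ..."
STATS = re.compile(r"processed (\d+) insns \(limit (\d+)\)")


def processed_insns(log_text):
    """Return (processed, limit) from the last summary line, or None."""
    matches = STATS.findall(log_text)
    if not matches:
        return None
    processed, limit = matches[-1]
    return int(processed), int(limit)


def insn_delta(before_log, after_log):
    """Change in processed instructions between two loads (negative = fewer)."""
    before = processed_insns(before_log)
    after = processed_insns(after_log)
    if before is None or after is None:
        raise ValueError("no 'processed N insns' summary found in a log")
    return after[0] - before[0]


# Hypothetical logs, for illustration only.
before_log = "...\nprocessed 1000001 insns (limit 1000000) max_states_per_insn 9\n"
after_log = "...\nprocessed 640000 insns (limit 1000000) max_states_per_insn 7\n"

print(insn_delta(before_log, after_log))
```

This only reports the total per load; it does not attribute the savings to
individual source lines.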

Thanks.