On Fri, Feb 9, 2024 at 4:33 PM Andrii Nakryiko <andrii@xxxxxxxxxx> wrote: > > As BPF applications grow in size and complexity and are separated into > multiple .bpf.c files that are statically linked together, it becomes > harder and harder to match verifier's BPF assembly level output to > original C code. While often annotated C source code is unique enough to > be able to identify the file it belongs to, quite often this is actually > problematic as parts of source code can be quite generic. > > Long story short, it is very useful to see source code file name and > line number information along with the original C code. Verifier already > knows this information, we just need to output it. > > This patch set is an initial proposal on how this can be done. No new > flags are added and file:line information is appended at the end of > C code: > > ; <original C code> (<filename>.bpf.c:<line>) > > If file name has directory names in it, they are stripped away. This > should be fine in practice as file names tend to be pretty unique with > C code anyways, and keeping log size smaller is always good. > > In practice this might look something like below, where some code is > coming from application files, while others are from libbpf's usdt.bpf.h > header file: > > ; if (STROBEMETA_READ( (strobemeta_probe.bpf.c:534) > 5592: (79) r1 = *(u64 *)(r10 -56) ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > 5593: (7b) *(u64 *)(r10 -56) = r1 ; R1_w=mem_or_null(id=1589,sz=7680) R10=fp0 fp-56_w=mem_or_null(id=1589,sz=7680) > 5594: (79) r3 = *(u64 *)(r10 -8) ; R3_w=scalar() R10=fp0 fp-8=mmmmmmmm > > ... > > 170: (71) r1 = *(u8 *)(r8 +15) ; frame1: R1_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) R8_w=map_value(map=__bpf_usdt_spec,ks=4,vs=208) > 171: (67) r1 <<= 56 ; frame1: R1_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) > 172: (c7) r1 s>>= 56 ; frame1: R1_w=scalar(smin=smin32=-128,smax=smax32=127) > ; val <<= arg_spec->arg_bitshift; (usdt.bpf.h:183) > 173: (67) r1 <<= 32 ; frame1: R1_w=scalar(smax=0x7f00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) > 174: (77) r1 >>= 32 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) > 175: (79) r2 = *(u64 *)(r10 -8) ; frame1: R2_w=scalar() R10=fp0 fp-8=mmmmmmmm > 176: (6f) r2 <<= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R2_w=scalar() > 177: (7b) *(u64 *)(r10 -8) = r2 ; frame1: R2_w=scalar(id=61) R10=fp0 fp-8_w=scalar(id=61) > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > 178: (bf) r3 = r2 ; frame1: R2_w=scalar(id=61) R3_w=scalar(id=61) > 179: (7f) r3 >>= r1 ; frame1: R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R3_w=scalar() > ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > 180: (71) r4 = *(u8 *)(r8 +14) > 181: safe > > I've played with few different formats and none stood out as > particularly better than other. Suggestions and votes are appreciated: > > a) ; if (arg_spec->arg_signed) (usdt.bpf.h:184) > b) ; if (arg_spec->arg_signed) [usdt.bpf.h:184] > c) ; [usdt.bpf.h:184] if (arg_spec->arg_signed) > d) ; (usdt.bpf.h:184) if (arg_spec->arg_signed) Great idea. I would drop parenthesis. Don't see a reason to wrap a text. Since we already use ';' as a comment I would continue: e) ; if (arg_spec->arg_signed) ; usdt.bpf.h:184 Note that that fragile test needs to be adjusted again: Error: #137/3 log_fixup/bad_core_relo_trunc_full pw-bot: cr