On Thu, Jul 25, 2024 at 7:57 PM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote: > > > > On 26/7/24 05:27, Andrii Nakryiko wrote: > > On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote: > >> > >> > >> > >> On 25/7/24 14:09, Yonghong Song wrote: > >>> > >>> On 7/24/24 11:05 PM, Leon Hwang wrote: > >>>> > >>>> On 25/7/24 13:54, Yonghong Song wrote: > >>>>> On 7/24/24 10:15 PM, Zheao Li wrote: > >>>>>> This is a v2 patch, previous Link: > >>>>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u > >>>>>> > > [SNI] > [...] > > > > Build and run, sudo ./retsnoop -e verbose -e bpf_log -e > bpf_verifier_vlog -e bpf_verifier_log_write -STA -v, here's the output: > > > FUNCTION CALLS RESULT DURATION ARGS > -------------- ------ -------- ---- > ↔ bpf_log [void] 1.350us log=NULL fmt='%s() is not a global > function ' =(vararg) > > It's great to show arguments. > Thanks for repro steps, they worked. Also, I just pushed latest retsnoop version to Github that does support capturing vararg arguments for printf-like functions. See full debugging log at [0], but I basically did just two things: $ sudo retsnoop -e '*sys_bpf' --lbr -n freplace -n freplace filters by process name, to avoid the noise. I traced bpf() syscall (*sys_bf), and I requested function call LBR (Last Branch Record) stack. LBR showed that we have bpf_prog_attach_check_attach_type() call, and then eventually we get to bpf_log(). So I then traced bpf_log (no --lbr this time, but I requested function trace + arguments capture: $ sudo retsnoop -n freplace -e '*sys_bpf' -a bpf_log -TA 17:02:39.968302 -> 17:02:39.968307 TID/PID 2730863/2730855 (freplace/freplace): FUNCTION CALLS RESULT DURATION ARGS ----------------- --------- -------- ---- → __x64_sys_bpf regs=&{.r15=2,.r14=0xc0000061c0,.bp=0xc00169f8a8,.bx=28,.r11=514,.ax=0xffffffffffffffda,.cx=0x404f4e,.dx=64,.si=0xc00169fa10… → __sys_bpf cmd=28 uattr={{.kernel=0xc00169fa10,.user=0xc00169fa10}} size=64 ↔ bpf_log [void] 1.550us log=NULL fmt='%s() is not a global function ' vararg0='stub_handler_static' ← __sys_bpf [-EINVAL] 4.115us ← __x64_sys_bpf [-EINVAL] 5.467us For __x64_sys_bpf that's struct pt_regs, which isn't that interesting, but then we have: ↔ bpf_log [void] 1.550us log=NULL fmt='%s() is not a global function ' vararg0='stub_handler_static' Which showed format string and the argument passed to it: 'stub_hanler_static' subprogram seems to be the problem here. Anyways, tbh, for a problem like this, it's probably best to just request a verbose log when doing the BPF_PROG_LOAD command. You can *normally* use veristat tool to get that easily, if you have a .bpf.o object file on the disk. But in this case it's freplace and veristat doesn't know what's the target BPF program, so it's not that useful in this case: $ sudo veristat -v freplace_bpfel.o Processing 'freplace_bpfel.o'... libbpf: prog 'freplace_handler': attach program FD is not set libbpf: prog 'freplace_handler': failed to prepare load attributes: -22 libbpf: prog 'freplace_handler': failed to load: -22 libbpf: failed to load object 'freplace_bpfel.o' PROCESSING freplace_bpfel.o/freplace_handler, DURATION US: 0, VERDICT: failure, VERIFIER LOG: File Program Verdict Duration (us) Insns States Peak states ---------------- ---------------- ------- ------------- ----- ------ ----------- freplace_bpfel.o freplace_handler failure 0 0 0 0 ---------------- ---------------- ------- ------------- ----- ------ ----------- Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs. But for lots of other programs this would be a no-brainer. [0] https://gist.github.com/anakryiko/88a1597a68e43dc945e40fde88a96e7e [...] > > Is it OK to add a tracepoint here? I think tracepoint is more generic > than retsnoop-like way. I personally don't see a problem with adding tracepoint, but how would it look like, given we are talking about vararg printf-style function calls? I'm not sure how that should be represented in such a way as to make it compatible with tracepoints and not cause any runtime overhead. > > Thanks, > Leon > >