On 2024/7/27 08:12, Andrii Nakryiko wrote: > On Thu, Jul 25, 2024 at 7:57 PM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote: >> >> >> >> On 26/7/24 05:27, Andrii Nakryiko wrote: >>> On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote: >>>> >>>> >>>> >>>> On 25/7/24 14:09, Yonghong Song wrote: >>>>> >>>>> On 7/24/24 11:05 PM, Leon Hwang wrote: >>>>>> >>>>>> On 25/7/24 13:54, Yonghong Song wrote: >>>>>>> On 7/24/24 10:15 PM, Zheao Li wrote: >>>>>>>> This is a v2 patch, previous Link: >>>>>>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u >>>>>>>> >> >> [SNI] >> > > [...] > >>> >> >> Build and run, sudo ./retsnoop -e verbose -e bpf_log -e >> bpf_verifier_vlog -e bpf_verifier_log_write -STA -v, here's the output: >> >> >> FUNCTION CALLS RESULT DURATION ARGS >> -------------- ------ -------- ---- >> ↔ bpf_log [void] 1.350us log=NULL fmt='%s() is not a global >> function ' =(vararg) >> >> It's great to show arguments. >> > > Thanks for repro steps, they worked. Also, I just pushed latest > retsnoop version to Github that does support capturing vararg > arguments for printf-like functions. See full debugging log at [0], > but I basically did just two things: > > $ sudo retsnoop -e '*sys_bpf' --lbr -n freplace > > -n freplace filters by process name, to avoid the noise. I traced > bpf() syscall (*sys_bf), and I requested function call LBR (Last > Branch Record) stack. LBR showed that we have > bpf_prog_attach_check_attach_type() call, and then eventually we get > to bpf_log(). > > So I then traced bpf_log (no --lbr this time, but I requested function > trace + arguments capture: > > $ sudo retsnoop -n freplace -e '*sys_bpf' -a bpf_log -TA > > 17:02:39.968302 -> 17:02:39.968307 TID/PID 2730863/2730855 (freplace/freplace): > > FUNCTION CALLS RESULT DURATION ARGS > ----------------- --------- -------- ---- > → __x64_sys_bpf > regs=&{.r15=2,.r14=0xc0000061c0,.bp=0xc00169f8a8,.bx=28,.r11=514,.ax=0xffffffffffffffda,.cx=0x404f4e,.dx=64,.si=0xc00169fa10… > → __sys_bpf cmd=28 > uattr={{.kernel=0xc00169fa10,.user=0xc00169fa10}} size=64 > ↔ bpf_log [void] 1.550us log=NULL fmt='%s() is not a > global function ' vararg0='stub_handler_static' > ← __sys_bpf [-EINVAL] 4.115us > ← __x64_sys_bpf [-EINVAL] 5.467us > > > For __x64_sys_bpf that's struct pt_regs, which isn't that interesting, > but then we have: > > ↔ bpf_log [void] 1.550us log=NULL fmt='%s() is not a global > function ' vararg0='stub_handler_static' It's awesome to show vararg. > > Which showed format string and the argument passed to it: > 'stub_hanler_static' subprogram seems to be the problem here. > > > Anyways, tbh, for a problem like this, it's probably best to just > request a verbose log when doing the BPF_PROG_LOAD command. You can > *normally* use veristat tool to get that easily, if you have a .bpf.o > object file on the disk. But in this case it's freplace and veristat > doesn't know what's the target BPF program, so it's not that useful in > this case: > > $ sudo veristat -v freplace_bpfel.o > Processing 'freplace_bpfel.o'... > libbpf: prog 'freplace_handler': attach program FD is not set > libbpf: prog 'freplace_handler': failed to prepare load attributes: -22 > libbpf: prog 'freplace_handler': failed to load: -22 > libbpf: failed to load object 'freplace_bpfel.o' > PROCESSING freplace_bpfel.o/freplace_handler, DURATION US: 0, VERDICT: > failure, VERIFIER LOG: > > File Program Verdict Duration (us) Insns > States Peak states > ---------------- ---------------- ------- ------------- ----- > ------ ----------- > freplace_bpfel.o freplace_handler failure 0 0 > 0 0 > ---------------- ---------------- ------- ------------- ----- > ------ ----------- > Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs. > > But for lots of other programs this would be a no-brainer. > > > [0] https://gist.github.com/anakryiko/88a1597a68e43dc945e40fde88a96e7e > > [...] > >> >> Is it OK to add a tracepoint here? I think tracepoint is more generic >> than retsnoop-like way. > > I personally don't see a problem with adding tracepoint, but how would > it look like, given we are talking about vararg printf-style function > calls? I'm not sure how that should be represented in such a way as to > make it compatible with tracepoints and not cause any runtime > overhead. The tracepoint is not about vararg printf-style function calls. It is to trace the reason why it fails to bpf_check_attach_target() at attach time. So, let me introduce bpf_check_attach_target_with_tracepoint() and BPF_LOG_KERNEL_WITHOUT_PRINT. bpf_check_attach_target_with_tracepoint() is to call bpf_check_attach_target() and then to call trace_bpf_check_attach_target() if err. BPF_LOG_KERNEL_WITHOUT_PRINT is to avoid pr_err() in bpf_verifier_vlog(). Here's the diff without trace_bpf_check_attach_target() definition: diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index bfd093ac333f2..717e4deda2998 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -629,6 +629,7 @@ struct bpf_verifier_log { #define BPF_LOG_LEVEL (BPF_LOG_LEVEL1 | BPF_LOG_LEVEL2) #define BPF_LOG_MASK (BPF_LOG_LEVEL | BPF_LOG_STATS | BPF_LOG_FIXED) #define BPF_LOG_KERNEL (BPF_LOG_MASK + 1) /* kernel internal flag */ +#define BPF_LOG_KERNEL_WITHOUT_PRINT (BPF_LOG_KERNEL << 1) #define BPF_LOG_MIN_ALIGNMENT 8U #define BPF_LOG_ALIGNMENT 40U @@ -853,6 +854,11 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, const struct bpf_prog *tgt_prog, u32 btf_id, struct bpf_attach_target_info *tgt_info); +int bpf_check_attach_target_with_tracepoint(const struct bpf_prog *prog, + const struct bpf_prog *tgt_prog, + u32 btf_id, + struct bpf_attach_target_info *tgt_info); + void bpf_free_kfunc_btf_tab(struct bpf_kfunc_btf_tab *tab); int mark_chain_precision(struct bpf_verifier_env *env, int regno); diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c index 5aebfc3051e3a..2bdcdd2fc320f 100644 --- a/kernel/bpf/log.c +++ b/kernel/bpf/log.c @@ -65,8 +65,10 @@ void bpf_verifier_vlog(struct bpf_verifier_log *log, const char *fmt, n = vscnprintf(log->kbuf, BPF_VERIFIER_TMP_LOG_SIZE, fmt, args); - if (log->level == BPF_LOG_KERNEL) { + if (log->level & BPF_LOG_KERNEL) { bool newline = n > 0 && log->kbuf[n - 1] == '\n'; + if (log->level & BPF_LOG_KERNEL_WITHOUT_PRINT) + return; pr_err("BPF: %s%s", log->kbuf, newline ? "" : "\n"); return; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 869265852d515..4d293a8da5a9b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3464,8 +3464,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog, */ struct bpf_attach_target_info tgt_info = {}; - err = bpf_check_attach_target(NULL, prog, tgt_prog, btf_id, - &tgt_info); + err = bpf_check_attach_target_with_tracepoint(prog, tgt_prog, + btf_id, &tgt_info); if (err) goto out_unlock; diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c index f8302a5ca400d..44b9f95a07e9c 100644 --- a/kernel/bpf/trampoline.c +++ b/kernel/bpf/trampoline.c @@ -699,9 +699,9 @@ int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog, u64 key; int err; - err = bpf_check_attach_target(NULL, prog, NULL, - prog->aux->attach_btf_id, - &tgt_info); + err = bpf_check_attach_target_with_tracepoint(prog, NULL, + prog->aux->attach_btf_id, + &tgt_info); if (err) return err; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 1f5302fb09570..acd9b1b96c76c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -21929,6 +21929,28 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, return 0; } +int bpf_check_attach_target_with_tracepoint(const struct bpf_prog *prog, + const struct bpf_prog *tgt_prog, + u32 btf_id, + struct bpf_attach_target_info *tgt_info); +{ + struct bpf_verifier_log *log; + int err; + + log = kzalloc(sizeof(*log), GFP_KERNEL); + if (!log) { + err = -ENOMEM; + return err; + } + + log->level = BPF_LOG_KERNEL | BPF_LOG_KERNEL_WITHOUT_PRINT; + err = bpf_check_attach_target(log, prog, tgt_prog, btf_id, tgt_info); + if (err) + trace_bpf_check_attach_target(log->kbuf); + kfree(log); + return err; +} + BTF_SET_START(btf_id_deny) BTF_ID_UNUSED #ifdef CONFIG_SMP