On 25/7/24 14:09, Yonghong Song wrote: > > On 7/24/24 11:05 PM, Leon Hwang wrote: >> >> On 25/7/24 13:54, Yonghong Song wrote: >>> On 7/24/24 10:15 PM, Zheao Li wrote: >>>> This is a v2 patch, previous Link: >>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u >>>> >>>> Compare with v1: >>>> >>>> 1. Format the code style and signed-off field >>>> 2. Use a shorter name bpf_check_attach_target_with_klog instead of >>>> original name bpf_check_attach_target_with_kernel_log >>>> >>>> When attaching a freplace hook, failures can occur, >>>> but currently, no output is provided to help developers diagnose the >>>> root cause. >>>> >>>> This commit adds a new method, bpf_check_attach_target_with_klog, >>>> which outputs the verifier log to the kernel. >>>> Developers can then use dmesg to obtain more detailed information >>>> about the failure. >>>> >>>> For an example of eBPF code, >>>> Link: >>>> https://github.com/Asphaltt/learn-by-example/blob/main/ebpf/freplace/main.go >>>> >>>> Co-developed-by: Leon Hwang <hffilwlqm@xxxxxxxxx> >>>> Signed-off-by: Leon Hwang <hffilwlqm@xxxxxxxxx> >>>> Signed-off-by: Zheao Li <me@xxxxxxxxxxxx> >>>> --- >>>> include/linux/bpf_verifier.h | 5 +++++ >>>> kernel/bpf/syscall.c | 5 +++-- >>>> kernel/bpf/trampoline.c | 6 +++--- >>>> kernel/bpf/verifier.c | 19 +++++++++++++++++++ >>>> 4 files changed, 30 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/include/linux/bpf_verifier.h >>>> b/include/linux/bpf_verifier.h >>>> index 5cea15c81b8a..8eddba62c194 100644 >>>> --- a/include/linux/bpf_verifier.h >>>> +++ b/include/linux/bpf_verifier.h >>>> @@ -848,6 +848,11 @@ static inline void bpf_trampoline_unpack_key(u64 >>>> key, u32 *obj_id, u32 *btf_id) >>>> *btf_id = key & 0x7FFFFFFF; >>>> } >>>> +int bpf_check_attach_target_with_klog(const struct bpf_prog *prog, >>>> + const struct bpf_prog *tgt_prog, >>>> + u32 btf_id, >>>> + struct bpf_attach_target_info *tgt_info); >>> format issue in the above. Same code alignment is needed for arguments >>> in different lines. >>> >>>> + >>>> int bpf_check_attach_target(struct bpf_verifier_log *log, >>>> const struct bpf_prog *prog, >>>> const struct bpf_prog *tgt_prog, >>>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c >>>> index 869265852d51..bf826fcc8cf4 100644 >>>> --- a/kernel/bpf/syscall.c >>>> +++ b/kernel/bpf/syscall.c >>>> @@ -3464,8 +3464,9 @@ static int bpf_tracing_prog_attach(struct >>>> bpf_prog *prog, >>>> */ >>>> struct bpf_attach_target_info tgt_info = {}; >>>> - err = bpf_check_attach_target(NULL, prog, tgt_prog, btf_id, >>>> - &tgt_info); >>>> + err = bpf_check_attach_target_with_klog(prog, NULL, >>>> + prog->aux->attach_btf_id, >>>> + &tgt_info); >>> code alignment issue here as well. >>> Also, the argument should be 'prog, tgt_prog, btf_id, &tgt_info', right? >>> >>>> if (err) >>>> goto out_unlock; >>>> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c >>>> index f8302a5ca400..8862adaa7302 100644 >>>> --- a/kernel/bpf/trampoline.c >>>> +++ b/kernel/bpf/trampoline.c >>>> @@ -699,9 +699,9 @@ int bpf_trampoline_link_cgroup_shim(struct >>>> bpf_prog *prog, >>>> u64 key; >>>> int err; >>>> - err = bpf_check_attach_target(NULL, prog, NULL, >>>> - prog->aux->attach_btf_id, >>>> - &tgt_info); >>>> + err = bpf_check_attach_target_with_klog(prog, NULL, >>>> + prog->aux->attach_btf_id, >>>> + &tgt_info); >>> code alignment issue here >>> >>>> if (err) >>>> return err; >>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>>> index 1f5302fb0957..4873b72f5a9a 100644 >>>> --- a/kernel/bpf/verifier.c >>>> +++ b/kernel/bpf/verifier.c >>>> @@ -21643,6 +21643,25 @@ static int >>>> check_non_sleepable_error_inject(u32 btf_id) >>>> return btf_id_set_contains(&btf_non_sleepable_error_inject, >>>> btf_id); >>>> } >>>> +int bpf_check_attach_target_with_klog(const struct bpf_prog *prog, >>>> + const struct bpf_prog *tgt_prog, >>>> + u32 btf_id, >>>> + struct bpf_attach_target_info *tgt_info); >>> code alignment issue here. >>> >>>> +{ >>>> + struct bpf_verifier_log *log; >>>> + int err; >>>> + >>>> + log = kzalloc(sizeof(*log), GFP_KERNEL | __GFP_NOWARN); >>> __GFP_NOWARN is unnecessary here. >>> >>>> + if (!log) { >>>> + err = -ENOMEM; >>>> + return err; >>>> + } >>>> + log->level = BPF_LOG_KERNEL; >>>> + err = bpf_check_attach_target(log, prog, tgt_prog, btf_id, >>>> tgt_info); >>>> + kfree(log); >>>> + return err; >>>> +} >>>> + >>>> int bpf_check_attach_target(struct bpf_verifier_log *log, >>>> const struct bpf_prog *prog, >>>> const struct bpf_prog *tgt_prog, >>> More importantly, Andrii has implemented retsnoop, which intends to >>> locate >>> precise location in the kernel where err happens. The link is >>> https://github.com/anakryiko/retsnoop >>> >>> Maybe you want to take a look and see whether it can resolve your issue. >>> We should really avoid putting more stuff in dmesg whenever possible. >>> >> retsnoop is really cool. >> >> However, when something wrong in bpf_check_attach_target(), retsnoop >> only gets its return value -EINVAL, without any bpf_log() in it. It's >> hard to figure out the reason why bpf_check_attach_target() returns >> -EINVAL. > > It should have line number like below in > https://github.com/anakryiko/retsnoop > > |$ sudo ./retsnoop -e '*sys_bpf' -a ':kernel/bpf/*.c' Receiving data... > 20:19:36.372607 -> 20:19:36.372682 TID/PID 8346/8346 (simfail/simfail): > entry_SYSCALL_64_after_hwframe+0x63 (arch/x86/entry/entry_64.S:120:0) > do_syscall_64+0x35 (arch/x86/entry/common.c:80:7) . do_syscall_x64 > (arch/x86/entry/common.c:50:12) 73us [-ENOMEM] __x64_sys_bpf+0x1a > (kernel/bpf/syscall.c:5067:1) 70us [-ENOMEM] __sys_bpf+0x38b > (kernel/bpf/syscall.c:4947:9) . map_create (kernel/bpf/syscall.c:1106:8) > . find_and_alloc_map (kernel/bpf/syscall.c:132:5) ! 50us [-ENOMEM] > array_map_alloc !* 2us [NULL] bpf_map_alloc_percpu Could you double > check? It does need corresponding kernel source though. | > I have a try on an Ubuntu 24.04 VM, whose kernel is 6.8.0-39-generic. $ sudo retsnoop -e '*sys_bpf' -a ':kernel/bpf/*.c' -T Receiving data... 07:18:38.643643 -> 07:18:38.643728 TID/PID 6042/6039 (freplace/freplace): FUNCTION CALL TRACE RESULT DURATION --------------------------------------------------- -------------------- -------- → __x64_sys_bpf → __sys_bpf ↔ bpf_check_uarg_tail_zero [0] 2.376us → link_create ↔ __bpf_prog_get [0xffffb55f40db3000] 2.796us ↔ bpf_prog_attach_check_attach_type [0] 2.260us → bpf_tracing_prog_attach ↔ __bpf_prog_get [0xffffb55f40d71000] 9.455us → bpf_check_attach_target → btf_check_type_match → btf_check_func_type_match ↔ bpf_log [void] 2.578us ← btf_check_func_type_match [-EINVAL] 7.659us ← btf_check_type_match [-EINVAL] 15.950us ← bpf_check_attach_target [-EINVAL] 22.397us ↔ __bpf_prog_put [void] 2.323us ← bpf_tracing_prog_attach [-EINVAL] 45.509us ↔ __bpf_prog_put [void] 2.182us ← link_create [-EINVAL] 66.445us ← __sys_bpf [-EINVAL] 77.347us ← __x64_sys_bpf [-EINVAL] 81.979us entry_SYSCALL_64_after_hwframe+0x78 (arch/x86/entry/entry_64.S:130:0) do_syscall_64+0x7f (arch/x86/entry/common.c:83:7) . do_syscall_x64 (arch/x86/entry/common.c:52:12) x64_sys_call+0x1936 (arch/x86/entry/syscall_64.c:33:1) 81us [-EINVAL] __x64_sys_bpf+0x1a (kernel/bpf/syscall.c:5588:1) 77us [-EINVAL] __sys_bpf+0x4ae (kernel/bpf/syscall.c:5556:9) ! 66us [-EINVAL] link_create !* 45us [-EINVAL] bpf_tracing_prog_attach !* 22us [-EINVAL] bpf_check_attach_target !* 15us [-EINVAL] btf_check_type_match !* 7us [-EINVAL] btf_check_func_type_match P.S. Check https://gist.github.com/Asphaltt/883fd7362968f7747e820d63a9519971 to have a better view of this output. When attach freplace prog to a static-noline subprog, there is a bpf_log() in btf_check_func_type_match(). However, I don't know what bpf_log() logs. With this patch, we are able to figure out what bpf_log() logs. Therefore, we are able to figure out the reason why it fails to attach freplace prog. Thanks, Leon