Re: [PATCH bpf-next v2] bpf: Add bpf_check_attach_target_with_klog method to output failure logs to kernel

Leon Hwang <hffilwlqm@xxxxxxxxx> · Fri, 26 Jul 2024 10:57:26 +0800

On 26/7/24 05:27, Andrii Nakryiko wrote:
> On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>>
>>
>>
>> On 25/7/24 14:09, Yonghong Song wrote:
>>>
>>> On 7/24/24 11:05 PM, Leon Hwang wrote:
>>>>
>>>> On 25/7/24 13:54, Yonghong Song wrote:
>>>>> On 7/24/24 10:15 PM, Zheao Li wrote:
>>>>>> This is a v2 patch, previous Link:
>>>>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u
>>>>>>

[SNI]

>>>>> More importantly, Andrii has implemented retsnoop, which intends to
>>>>> locate
>>>>> precise location in the kernel where err happens. The link is
>>>>>    https://github.com/anakryiko/retsnoop
>>>>>
>>>>> Maybe you want to take a look and see whether it can resolve your issue.
>>>>> We should really avoid putting more stuff in dmesg whenever possible.
>>>>>
>>>> retsnoop is really cool.
>>>>
>>>> However, when something wrong in bpf_check_attach_target(), retsnoop
>>>> only gets its return value -EINVAL, without any bpf_log() in it. It's
>>>> hard to figure out the reason why bpf_check_attach_target() returns
>>>> -EINVAL.
>>>
>>> It should have line number like below in
>>> https://github.com/anakryiko/retsnoop
>>>
>>> |$ sudo ./retsnoop -e '*sys_bpf' -a ':kernel/bpf/*.c' Receiving data...
>>> 20:19:36.372607 -> 20:19:36.372682 TID/PID 8346/8346 (simfail/simfail):
>>> entry_SYSCALL_64_after_hwframe+0x63 (arch/x86/entry/entry_64.S:120:0)
>>> do_syscall_64+0x35 (arch/x86/entry/common.c:80:7) . do_syscall_x64
>>> (arch/x86/entry/common.c:50:12) 73us [-ENOMEM] __x64_sys_bpf+0x1a
>>> (kernel/bpf/syscall.c:5067:1) 70us [-ENOMEM] __sys_bpf+0x38b
>>> (kernel/bpf/syscall.c:4947:9) . map_create (kernel/bpf/syscall.c:1106:8)
>>> . find_and_alloc_map (kernel/bpf/syscall.c:132:5) ! 50us [-ENOMEM]
>>> array_map_alloc !* 2us [NULL] bpf_map_alloc_percpu Could you double
>>> check? It does need corresponding kernel source though. |
>>>
>>
>> I have a try on an Ubuntu 24.04 VM, whose kernel is 6.8.0-39-generic.
>>
>> $ sudo retsnoop -e '*sys_bpf' -a ':kernel/bpf/*.c' -T
>> Receiving data...
>> 07:18:38.643643 -> 07:18:38.643728 TID/PID 6042/6039 (freplace/freplace):
>>
>> FUNCTION CALL TRACE                                   RESULT
>>    DURATION
>> ---------------------------------------------------
>> --------------------  --------
>> → __x64_sys_bpf
>>
>>     → __sys_bpf
>>
>>         ↔ bpf_check_uarg_tail_zero                    [0]
>>     2.376us
>>         → link_create
>>
>>             ↔ __bpf_prog_get
>> [0xffffb55f40db3000]   2.796us
>>             ↔ bpf_prog_attach_check_attach_type       [0]
>>     2.260us
>>             → bpf_tracing_prog_attach
>>
>>                 ↔ __bpf_prog_get
>> [0xffffb55f40d71000]   9.455us
>>                 → bpf_check_attach_target
>>
>>                     → btf_check_type_match
>>
>>                         → btf_check_func_type_match
>>
>>                             ↔ bpf_log                 [void]
>>     2.578us
>>                         ← btf_check_func_type_match   [-EINVAL]
>>     7.659us
>>                     ← btf_check_type_match            [-EINVAL]
>>    15.950us
>>                 ← bpf_check_attach_target             [-EINVAL]
>>    22.397us
>>                 ↔ __bpf_prog_put                      [void]
>>     2.323us
>>             ← bpf_tracing_prog_attach                 [-EINVAL]
>>    45.509us
>>             ↔ __bpf_prog_put                          [void]
>>     2.182us
>>         ← link_create                                 [-EINVAL]
>>    66.445us
>>     ← __sys_bpf                                       [-EINVAL]
>>    77.347us
>> ← __x64_sys_bpf                                       [-EINVAL]
>>    81.979us
>>
>>                     entry_SYSCALL_64_after_hwframe+0x78
>> (arch/x86/entry/entry_64.S:130:0)
>>                     do_syscall_64+0x7f
>> (arch/x86/entry/common.c:83:7)
>>                     . do_syscall_x64
>> (arch/x86/entry/common.c:52:12)
>>                     x64_sys_call+0x1936
>> (arch/x86/entry/syscall_64.c:33:1)
>>     81us [-EINVAL]  __x64_sys_bpf+0x1a
>> (kernel/bpf/syscall.c:5588:1)
>>     77us [-EINVAL]  __sys_bpf+0x4ae
>> (kernel/bpf/syscall.c:5556:9)
>> !   66us [-EINVAL]  link_create
>>
>> !*  45us [-EINVAL]  bpf_tracing_prog_attach
>>
>> !*  22us [-EINVAL]  bpf_check_attach_target
>>
>> !*  15us [-EINVAL]  btf_check_type_match
>>
>> !*   7us [-EINVAL]  btf_check_func_type_match
>>
>> P.S. Check
>> https://gist.github.com/Asphaltt/883fd7362968f7747e820d63a9519971 to
>> have a better view of this output.
>>
>> When attach freplace prog to a static-noline subprog, there is a
>> bpf_log() in btf_check_func_type_match(). However, I don't know what
>> bpf_log() logs.
> 
> If you build the very latest retsnoop from Github (this functionality
> hasn't been released just yet, I added it literally two days ago), you
> will be able to capture bpf_log's format string. vararg arguments
> themselves are not (yet) captured, but I'm going to play with that.
> 
> Try something like this:
> 
> sudo ./retsnoop -e verbose -e bpf_log -e bpf_verifier_vlog -e
> bpf_verifier_log_write -STA -v
> 
> And you might see something like:
> 
> FUNCTION CALLS   RESULT  DURATION  ARGS
> --------------   ------  --------  ----
> ↔ bpf_log        [void]   2.555us  log=&{} fmt='func '%s' arg%d has
> btf_id %d type %s '%s' ' =(vararg)
> 
> or
> 
> FUNCTION CALLS   RESULT  DURATION  ARGS
> --------------   ------  --------  ----
> ↔ bpf_log        [void]   5.729us  log=&{} fmt='arg#%d reference
> type('%s %s') size cannot be determined: %ld ' =(vararg)
> 
> 
> So you'll get a general understanding from format string (but yeah,
> actual arguments would be good to have).
> 

Build and run, sudo ./retsnoop -e verbose -e bpf_log -e
bpf_verifier_vlog -e bpf_verifier_log_write -STA -v, here's the output:

FUNCTION CALLS   RESULT  DURATION  ARGS
--------------   ------  --------  ----
↔ bpf_log        [void]   1.350us  log=NULL fmt='%s() is not a global
function ' =(vararg)

It's great to show arguments.

> 
> This is not really a solution, but definitely useful for debugging.

I try another way with following bpf code:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("kprobe/bpf_log")
int k_bpf_log(struct pt_regs *ctx)
{
    const char *arg2 = (const char *) PT_REGS_PARM2(ctx);
    void *arg3 = (void *) PT_REGS_PARM3(ctx);
    void *arg4 = (void *) PT_REGS_PARM4(ctx);
    void *arg5 = (void *) PT_REGS_PARM5(ctx);
    char buf[30];

    bpf_probe_read_kernel_str(buf, 30, arg2);
    bpf_trace_printk(buf, 30, arg3, arg4, arg5);

    return BPF_OK;
}

It is able to print the log message:
<...>-6212    [003] ...21  3585.117121: bpf_trace_printk:
stub_handler_static() is not a global function

But it's not generic enough to inspect the log.

> 
> Is there some simple way for me to reproduce your specific issue, I'd
> like to use that as one motivating example to see how far retsnoop can
> be pushed.

Sure, here's my example to reproduce the issue:
https://github.com/Asphaltt/learn-by-example/tree/main/ebpf/freplace,
which relies on Go, clang and llvm-strip.

Reproduce the issue by:

git clone https://github.com/Asphaltt/learn-by-example.git
cd learn-by-example/ebpf/freplace
go generate
go build
./freplace --freplace-failed

Then, it will output something like:
2024/07/26 01:45:43 Failed to freplace(stub_handler_static): create
link: invalid argument

> 
> P.S. I do think that putting any logging like this into dmesg is
> definitely wrong, btw.
> 

Understand.

Is it OK to add a tracepoint here? I think tracepoint is more generic
than retsnoop-like way.

Thanks,
Leon