Re: [PATCH bpf-next v2] bpf: Add bpf_check_attach_target_with_klog method to output failure logs to kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 25, 2024 at 7:57 PM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
>
>
>
> On 26/7/24 05:27, Andrii Nakryiko wrote:
> > On Thu, Jul 25, 2024 at 12:33 AM Leon Hwang <hffilwlqm@xxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 25/7/24 14:09, Yonghong Song wrote:
> >>>
> >>> On 7/24/24 11:05 PM, Leon Hwang wrote:
> >>>>
> >>>> On 25/7/24 13:54, Yonghong Song wrote:
> >>>>> On 7/24/24 10:15 PM, Zheao Li wrote:
> >>>>>> This is a v2 patch, previous Link:
> >>>>>> https://lore.kernel.org/bpf/20240724152521.20546-1-me@xxxxxxxxxxxx/T/#u
> >>>>>>
>
> [SNI]
>

[...]

> >
>
> Build and run, sudo ./retsnoop -e verbose -e bpf_log -e
> bpf_verifier_vlog -e bpf_verifier_log_write -STA -v, here's the output:
>
>
> FUNCTION CALLS   RESULT  DURATION  ARGS
> --------------   ------  --------  ----
> ↔ bpf_log        [void]   1.350us  log=NULL fmt='%s() is not a global
> function ' =(vararg)
>
> It's great to show arguments.
>

Thanks for repro steps, they worked. Also, I just pushed latest
retsnoop version to Github that does support capturing vararg
arguments for printf-like functions. See full debugging log at [0],
but I basically did just two things:

$ sudo retsnoop -e '*sys_bpf' --lbr -n freplace

-n freplace filters by process name, to avoid the noise. I traced
bpf() syscall (*sys_bf), and I requested function call LBR (Last
Branch Record) stack. LBR showed that we have
bpf_prog_attach_check_attach_type() call, and then eventually we get
to bpf_log().

So I then traced bpf_log (no --lbr this time, but I requested function
trace + arguments capture:

$ sudo retsnoop -n freplace -e '*sys_bpf' -a bpf_log -TA

17:02:39.968302 -> 17:02:39.968307 TID/PID 2730863/2730855 (freplace/freplace):

FUNCTION CALLS      RESULT     DURATION  ARGS
-----------------   ---------  --------  ----
→ __x64_sys_bpf
regs=&{.r15=2,.r14=0xc0000061c0,.bp=0xc00169f8a8,.bx=28,.r11=514,.ax=0xffffffffffffffda,.cx=0x404f4e,.dx=64,.si=0xc00169fa10…
    → __sys_bpf                          cmd=28
uattr={{.kernel=0xc00169fa10,.user=0xc00169fa10}} size=64
        ↔ bpf_log   [void]      1.550us  log=NULL fmt='%s() is not a
global function ' vararg0='stub_handler_static'
    ← __sys_bpf     [-EINVAL]   4.115us
← __x64_sys_bpf     [-EINVAL]   5.467us


For __x64_sys_bpf that's struct pt_regs, which isn't that interesting,
but then we have:

↔ bpf_log   [void]      1.550us  log=NULL fmt='%s() is not a global
function ' vararg0='stub_handler_static'

Which showed format string and the argument passed to it:
'stub_hanler_static' subprogram seems to be the problem here.


Anyways, tbh, for a problem like this, it's probably best to just
request a verbose log when doing the BPF_PROG_LOAD command. You can
*normally* use veristat tool to get that easily, if you have a .bpf.o
object file on the disk. But in this case it's freplace and veristat
doesn't know what's the target BPF program, so it's not that useful in
this case:

$ sudo veristat -v freplace_bpfel.o
Processing 'freplace_bpfel.o'...
libbpf: prog 'freplace_handler': attach program FD is not set
libbpf: prog 'freplace_handler': failed to prepare load attributes: -22
libbpf: prog 'freplace_handler': failed to load: -22
libbpf: failed to load object 'freplace_bpfel.o'
PROCESSING freplace_bpfel.o/freplace_handler, DURATION US: 0, VERDICT:
failure, VERIFIER LOG:

File              Program           Verdict  Duration (us)  Insns
States  Peak states
----------------  ----------------  -------  -------------  -----
------  -----------
freplace_bpfel.o  freplace_handler  failure              0      0
 0            0
----------------  ----------------  -------  -------------  -----
------  -----------
Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs.

But for lots of other programs this would be a no-brainer.


  [0] https://gist.github.com/anakryiko/88a1597a68e43dc945e40fde88a96e7e

[...]

>
> Is it OK to add a tracepoint here? I think tracepoint is more generic
> than retsnoop-like way.

I personally don't see a problem with adding tracepoint, but how would
it look like, given we are talking about vararg printf-style function
calls? I'm not sure how that should be represented in such a way as to
make it compatible with tracepoints and not cause any runtime
overhead.

>
> Thanks,
> Leon
>
>





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux