Re: [TEST FAILURE] bpf: s390: missed/kprobe_recursion

On Mon, 27 Jan 2025 11:09:27 -0800
Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:

> On Sun, Jan 26, 2025 at 2:06 PM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> >
> > On Sun, Jan 26, 2025 at 11:40:05PM +0900, Masami Hiramatsu wrote:
> > > On Fri, 24 Jan 2025 16:41:38 +0100
> > > Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> > >
> > > > On Fri, Jan 24, 2025 at 12:23:35PM +0100, Jiri Olsa wrote:
> > > > > On Thu, Jan 23, 2025 at 02:32:38PM -0800, Martin KaFai Lau wrote:
> > > > > > Hi Jiri,
> > > > > >
> > > > > > The "missed/kprobe_recursion" fails consistently on s390. It seems to start
> > > > > > failing after the recent bpf and bpf-next tree ffwd.
> > > > > >
> > > > > > An example:
> > > > > > https://github.com/kernel-patches/bpf/actions/runs/12934431612/job/36076956920
> > > > > >
> > > > > > Can you help to take a look?
> > > > > >
> > > > > > > afaict, it only happens on s390 so far, so cc Ilya in case any recent
> > > > > > > change rings a bell.
> > > > >
> > > > > hi,
> > > > > > I need to check more, but I wonder if it's:
> > > > > >   7495e179b478 s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
> > > > > >
> > > > > > which seems to add a recursion check and bail out before we have
> > > > > > a chance to trigger it in bpf code
> > > >
> > > > so the test attaches bpf program test1 to bpf_fentry_test1 via kprobe.multi
> > > >
> > > >     SEC("kprobe.multi/bpf_fentry_test1")
> > > >     int test1(struct pt_regs *ctx)
> > > >     {
> > > >             bpf_kfunc_common_test();
> > > >             return 0;
> > > >     }
> > > >
> > > > and several other programs are attached to bpf_kfunc_common_test function
> > > >
> > > >
> > > > > I can't test this on s390, but it looks like the following is happening:
> > > >
> > > > kprobe.multi uses fprobe, so the test kernel path goes:
> > > >
> > > >     bpf_fentry_test1
> > > >       ftrace_graph_func
> > > >         function_graph_enter_regs
> > > > >           fprobe_entry
> > > > >             kprobe_multi_link_prog_run
> > > > >               test1 (bpf program)
> > > > >                 bpf_kfunc_common_test
> > > > >                   kprobe_ftrace_handler
> > > > >                     kprobe_perf_func
> > > > >                       trace_call_bpf
> > > > >                         -> bpf_prog_active check fails, missed count is incremented
> > > >
> > > >
> > > > kprobe_ftrace_handler calls/takes ftrace_test_recursion_trylock (ftrace recursion lock)
> > > >
> > > > but s390 now calls/takes ftrace_test_recursion_trylock already in ftrace_graph_func,
> > > > so s390 stops at kprobe_ftrace_handler and does not get to trace_call_bpf to increment
> > > > prog->missed counters
> > >
> > > > Oops, good catch! I forgot to remove it from s390. We've already moved it
> > > > into function_graph_enter_regs().
> > >
> > >
> > > >
> > > > adding Sven, Masami, any idea?
> > > >
> > > > > if the ftrace_test_recursion_trylock is needed in ftrace_graph_func on s390, then
> > > > > I think we will need to fix our test to skip s390 arch
> > >
> > > > Yes. Please try this patch:
> > >
> > >
> > > From 12fcda79d0b1082449d5f7cfb8039b0237cf246d Mon Sep 17 00:00:00 2001
> > > From: "Masami Hiramatsu (Google)" <mhiramat@xxxxxxxxxx>
> > > Date: Sun, 26 Jan 2025 23:38:59 +0900
> > > Subject: [PATCH] s390: fgraph: Fix to remove ftrace_test_recursion_trylock()
> > >
> > > > Remove ftrace_test_recursion_trylock() from ftrace_graph_func()
> > > > because commit d576aec24df9 ("fgraph: Get ftrace recursion lock in
> > > > function_graph_enter") has already moved it to
> > > > function_graph_enter_regs().
> > >
> > > Reported-by: Jiri Olsa <olsajiri@xxxxxxxxx>
> > > Fixes: d576aec24df9 ("fgraph: Get ftrace recursion lock in function_graph_enter")
> > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
> >
> > great, ci is passing with this fix
> >
> > Tested-by: Jiri Olsa <jolsa@xxxxxxxxxx>

Thanks for testing!

> 
> Masami,
> 
> Are you going to land this fix in your tree? We can create a temporary
> patch for BPF CI once you have the commit in the tree.

I think this fix should go through the linux-trace tree. I also found
another issue on s390. (s390 does not implement )
Let me resend it because I forgot to Cc the linux-trace ML.

Thank you,
> 
> >
> > thanks,
> > jirka
> >
> >
> > > ---
> > >  arch/s390/kernel/ftrace.c | 5 -----
> > >  1 file changed, 5 deletions(-)
> > >
> > > diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
> > > index c0b2c97efefb..63ba6306632e 100644
> > > --- a/arch/s390/kernel/ftrace.c
> > > +++ b/arch/s390/kernel/ftrace.c
> > > @@ -266,18 +266,13 @@ void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> > >                      struct ftrace_ops *op, struct ftrace_regs *fregs)
> > >  {
> > >       unsigned long *parent = &arch_ftrace_regs(fregs)->regs.gprs[14];
> > > -     int bit;
> > >
> > >       if (unlikely(ftrace_graph_is_dead()))
> > >               return;
> > >       if (unlikely(atomic_read(&current->tracing_graph_pause)))
> > >               return;
> > > -     bit = ftrace_test_recursion_trylock(ip, *parent);
> > > -     if (bit < 0)
> > > -             return;
> > >       if (!function_graph_enter_regs(*parent, ip, 0, parent, fregs))
> > >               *parent = (unsigned long)&return_to_handler;
> > > -     ftrace_test_recursion_unlock(bit);
> > >  }
> > >
> > >  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> > > --
> > > 2.43.0
> > >
> > > Thank you,
> > >
> > > --
> > > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
> >


-- 
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>



