Re: [PATCH bpf] bpf, arm64: Fix stack frame construction for struct_ops trampoline

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 24, 2024 at 6:48 AM Xu Kuohai <xukuohai@xxxxxxxxxxxxxxx> wrote:
>
> On 10/23/2024 11:16 AM, Xu Kuohai wrote:
> > On 10/23/2024 7:37 AM, Puranjay Mohan wrote:
> >> On Wed, Oct 23, 2024 at 12:50 AM Alexei Starovoitov
> >> <alexei.starovoitov@xxxxxxxxx> wrote:
> >>>
> >>> On Sat, Oct 19, 2024 at 2:15 AM Xu Kuohai <xukuohai@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> From: Xu Kuohai <xukuohai@xxxxxxxxxx>
> >>>>
> >>>> The callsite layout for arm64 fentry is:
> >>>>
> >>>> mov x9, lr
> >>>> nop
> >>>>
> >>>> When a bpf prog is attached, the nop instruction is patched to a call
> >>>> to bpf trampoline:
> >>>>
> >>>> mov x9, lr
> >>>> bl <bpf trampoline>
> >>>>
> >>>> This passes two return addresses to bpf trampoline: the return address
> >>>> for the traced function/prog, stored in x9, and the return address for
> >>>> the bpf trampoline, stored in lr. To ensure stacktrace works properly,
> >>>> the bpf trampoline constructs two fake function stack frames using x9
> >>>> and lr.
> >>>>
> >>>> However, struct_ops progs are used as function callbacks and are invoked
> >>>> directly, without x9 being set as the fentry callsite does. Therefore,
> >>>> only one stack frame should be constructed using lr for struct_ops.
> >>>
> >>> Are you saying that currently stack unwinder on arm64 is
> >>> completely broken for struct_ops progs ?
> >>> or it shows an extra frame that doesn't have to be shown ?
> >>>
> >>> If former then it's certainly a bpf tree material.
> >>> If latter then bpf-next will do.
> >>> Pls clarify.
> >>
> >> It is not completely broken, only an extra garbage frame is shown
> >> between the caller of the trampoline and its caller.
> >>
> >
> > Yep, exactly. Here is a perf script sample, where tcp_ack+0x404
> > is the garbage frame.
> >
> > ffffffc0801a04b4 bpf_prog_50992e55a0f655a9_bpf_cubic_cong_avoid+0x98 (bpf_prog_50992e55a0f655a9_bpf_cubic_cong_avoid)
> > ffffffc0801a228c [unknown] ([kernel.kallsyms]) // bpf trampoline
> > ffffffd08d362590 tcp_ack+0x798 ([kernel.kallsyms]) // caller for bpf trampoline
> > ffffffd08d3621fc tcp_ack+0x404 ([kernel.kallsyms]) // garbage frame
> > ffffffd08d36452c tcp_rcv_established+0x4ac ([kernel.kallsyms])
> > ffffffd08d375c58 tcp_v4_do_rcv+0x1f0 ([kernel.kallsyms])
> > ffffffd08d378630 tcp_v4_rcv+0xeb8 ([kernel.kallsyms])
> > ...
> >
> > And this sample also shows that there is no symbol for the
> > struct_ops bpf trampoline. Maybe we should add symbol for it?
> >
>
> Emm, stack unwinder on x86 is completely broken for struct_ops
> progs.
>
> It's because the following function returns 0 for a struct_ops
> bpf trampoline address as there is no corresponding kernel symbol,
> which causes the address not to be recognized as kerneltext. As
> a result, the winder stops on ip == 0.
>
> unsigned long unwind_get_return_address(struct unwind_state *state)
> {
>          if (unwind_done(state))
>                  return 0;
>
>          return __kernel_text_address(state->ip) ? state->ip : 0;
> }
>
> Here is an example of broken stack trace from perf sampling, where
> only one stack frame is captured:
>
> ffffffffc000cfb4 bpf_prog_e60d93d3ec88d5ef_bpf_cubic_cong_avoid+0x78 (bpf_prog_e60d93d3ec88d5ef_bpf_cubic_cong_avoid)
> (no more frames)

you mean arch_stack_walk() won't see anything after ip=0,
but dump_stack() will still print the rest with "?" (as unreliable).

That's bad indeed.

> To fix it, I think kernel symbol should be added for struct_ops
> trampoline.

Makes sense. Pls send a patch.


As far as this patch please add an earlier example of double tcp_ack trace
to commit log and resubmit targeting bpf-next.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux