On Sat, Mar 06, 2021 at 10:13:57AM +0900, Masami Hiramatsu wrote: > On Fri, 5 Mar 2021 11:16:45 -0800 > Daniel Xu <dxu@xxxxxxxxx> wrote: > > > Hi Masami, > > > > On Sat, Mar 06, 2021 at 12:38:57AM +0900, Masami Hiramatsu wrote: > > > Hello, > > > > > > Here is a series of patches for kprobes and stacktracer to fix the kretprobe > > > entries in the kernel stack. This was reported by Daniel Xu. I thought that > > > was in the bpftrace, but it is actually more generic issue. > > > So I decided to fix the issue in arch independent part. > > > > > > While fixing the issue, I found a bug in ia64 related to kretprobe, which is > > > fixed by [1/5]. [2/5] and [3/5] is a kind of cleanup before fixing the main > > > issue. [4/5] is the patch to fix the stacktrace, which involves kretprobe > > > internal change. And [5/5] removing the stacktrace kretprobe fixup code in > > > ftrace. > > > > > > Daniel, can you also check that this fixes your issue too? I hope it is. > > > > Unfortunately, this patch series does not fix the issue I reported. > > Ah, OK. This was my mistake I didn't choose ORC unwinder for test kernel. > > > > > I think the reason your tests work is because you're using ftrace and > > the ORC unwinder is aware of ftrace trampolines (see > > arch/x86/kernel/unwind_orc.c:orc_ftrace_find). > > OK, so it has to be fixed in ORC unwinder. > > > > > bpftrace kprobes go through perf event subsystem (ie not ftrace) so > > naturally orc_ftrace_find() does not find an associated trampoline. ORC > > unwinding fails in this case because > > arch/x86/kernel/kprobes/core.c:trampoline_handler sets > > > > regs->ip = (unsigned long)&kretprobe_trampoline; > > > > and `kretprobe_trampoline` is marked > > > > STACK_FRAME_NON_STANDARD(kretprobe_trampoline); > > > > so it doesn't have a valid ORC entry. Thus, ORC immediately bails when > > trying to unwind past the first frame. > > Hm, in ftrace case, it decode kretprobe's stackframe and stoped right > after kretprobe_trampoline callsite. > > => kretprobe_trace_func+0x21f/0x340 > => kretprobe_dispatcher+0x73/0xb0 > => __kretprobe_trampoline_handler+0xcd/0x1a0 > => trampoline_handler+0x3d/0x50 > => kretprobe_trampoline+0x25/0x50 > => 0 > > kretprobe replaces the real return address with kretprobe_trampoline > and kretprobe_trampoline *calls* trampoline_handler (this part depends > on architecture implementation). > Thus, if kretprobe_trampoline has no stack frame information, ORC may > fails at the first kretprobe_trampoline+0x25, that is different from > the kretprobe_trampoline+0, so the hack doesn't work. I'm not sure I follow 100% what you're saying, but assuming you're asking why bpftrace fails at `kretprobe_trampoline+0` instead of `kretprobe_trampoline+0x25`: `regs` is set to &kretprobe_trampoline: regs->ip = (unsigned long)&kretprobe_trampoline; Then the kretprobe event is dispatched like this: kretprobe_trampoline_handler __kretprobe_trampoline_handler rp->handler // ie kernel/trace/trace_kprobe.c:kretprobe_dispatcher kretprobe_perf_func trace_bpf_call(.., regs) BPF_PROG_RUN_ARRAY_CHECK bpf_get_stackid(regs, .., ..) // in bpftrace prog And then `bpf_get_stackid` unwinds the stack via: bpf_get_stackid get_perf_callchain(regs, ...) perf_callchain_kernel(.., regs) perf_callchain_store(.., regs->ip) // !!! first unwound entry unwind_start unwind_next_frame In summary: unwinding via BPF begins at regs->ip, which `trampoline_handler` sets to `&kretprobe_trampoline+0x0`. > Hmm, how the other inline-asm function makes its stackframe info? > If I get the kretprobe_trampoline+0, I can fix it up. So I think I misunderstood the mechanics before. I think `call trampoline_handler` does set up a new frame. My current guess is that ftrace doesn't thread through `regs` like the bpf code path does. I'm not very familiar with ftrace internals so I haven't looked. > > > The only way I can think of to fix this issue is to make the ORC > > unwinder aware of kretprobe (ie the patch I sent earlier). I'm hoping > > you have another idea if my patch isn't acceptable. > > OK, anyway, your patch doesn't care the case that the multiple functions > are probed by kretprobes. In that case, you'll see several entries are > replaced by the kretprobe_trampoline. To find it correctly, you have > to pass a state-holder (@cur of the kretprobe_find_ret_addr()) > to the fixup entries. I'll see if I can figure something out tomorrow. Thanks, Daniel