Re: [PATCH -tip 0/5] kprobes: Fix stacktrace in kretprobes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 5 Mar 2021 11:16:45 -0800
Daniel Xu <dxu@xxxxxxxxx> wrote:

> Hi Masami,
> 
> On Sat, Mar 06, 2021 at 12:38:57AM +0900, Masami Hiramatsu wrote:
> > Hello,
> > 
> > Here is a series of patches for kprobes and stacktracer to fix the kretprobe
> > entries in the kernel stack. This was reported by Daniel Xu. I thought that
> > was in the bpftrace, but it is actually more generic issue.
> > So I decided to fix the issue in arch independent part.
> > 
> > While fixing the issue, I found a bug in ia64 related to kretprobe, which is
> > fixed by [1/5]. [2/5] and [3/5] is a kind of cleanup before fixing the main
> > issue. [4/5] is the patch to fix the stacktrace, which involves kretprobe
> > internal change. And [5/5] removing the stacktrace kretprobe fixup code in
> > ftrace. 
> > 
> > Daniel, can you also check that this fixes your issue too? I hope it is.
> 
> Unfortunately, this patch series does not fix the issue I reported.

Ah, OK. This was my mistake I didn't choose ORC unwinder for test kernel.

> 
> I think the reason your tests work is because you're using ftrace and
> the ORC unwinder is aware of ftrace trampolines (see
> arch/x86/kernel/unwind_orc.c:orc_ftrace_find).

OK, so it has to be fixed in ORC unwinder.

> 
> bpftrace kprobes go through perf event subsystem (ie not ftrace) so
> naturally orc_ftrace_find() does not find an associated trampoline. ORC
> unwinding fails in this case because
> arch/x86/kernel/kprobes/core.c:trampoline_handler sets
> 
>     regs->ip = (unsigned long)&kretprobe_trampoline;
> 
> and `kretprobe_trampoline` is marked
> 
>     STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
> 
> so it doesn't have a valid ORC entry. Thus, ORC immediately bails when
> trying to unwind past the first frame.

Hm, in ftrace case, it decode kretprobe's stackframe and stoped right
after kretprobe_trampoline callsite.

 => kretprobe_trace_func+0x21f/0x340
 => kretprobe_dispatcher+0x73/0xb0
 => __kretprobe_trampoline_handler+0xcd/0x1a0
 => trampoline_handler+0x3d/0x50
 => kretprobe_trampoline+0x25/0x50
 => 0

kretprobe replaces the real return address with kretprobe_trampoline
and kretprobe_trampoline *calls* trampoline_handler (this part depends
on architecture implementation).
Thus, if kretprobe_trampoline has no stack frame information, ORC may
fails at the first kretprobe_trampoline+0x25, that is different from
the kretprobe_trampoline+0, so the hack doesn't work.

Hmm, how the other inline-asm function makes its stackframe info?
If I get the kretprobe_trampoline+0, I can fix it up.

> The only way I can think of to fix this issue is to make the ORC
> unwinder aware of kretprobe (ie the patch I sent earlier). I'm hoping
> you have another idea if my patch isn't acceptable.

OK, anyway, your patch doesn't care the case that the multiple functions
are probed by kretprobes. In that case, you'll see several entries are
replaced by the kretprobe_trampoline. To find it correctly, you have
to pass a state-holder (@cur of the kretprobe_find_ret_addr())
to the fixup entries.

Thank you,

-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux