Re: [PATCH 2/4] x86/kprobes: Fix frame pointer annotations

On Thu, 9 May 2019 10:14:31 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Thu, May 09, 2019 at 10:20:30AM +0900, Masami Hiramatsu wrote:
> > Hi Josh,
> > 
> > On Wed, 8 May 2019 13:48:48 -0500
> > Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > 
> > > On Wed, May 08, 2019 at 05:39:07PM +0200, Peter Zijlstra wrote:
> > > > On Wed, May 08, 2019 at 07:42:48AM -0500, Josh Poimboeuf wrote:
> > > > > On Wed, May 08, 2019 at 02:04:16PM +0200, Peter Zijlstra wrote:
> > > > 
> > > > > > Do the x86_64 variants also want some ORC annotation?
> > > > > 
> > > > > Maybe so.  Though it looks like regs->ip isn't saved.  The saved
> > > > > registers might need to be tweaked.  I'll need to look into it.
> > > > 
> > > > What all these sites do (and maybe we should look at unifying them
> > > > somehow) is turn a CALL frame (aka RET-IP) into an exception frame (aka
> > > > pt_regs).
> > > > 
> > > > So regs->ip will be the return address (which is fixed up to be the CALL
> > > > address in the handler).
> > > 
> > > But from what I can tell, trampoline_handler() hard-codes regs->ip to
> > > point to kretprobe_trampoline(), and the original return address is
> > > placed in regs->sp.
> > > 
> > > Masami, is there a reason why regs->ip doesn't have the original return
> > > address and regs->sp doesn't have the original SP?  I think that would
> > > help the unwinder understand things.
> > 
> > Yes, for regs->ip, there is a historical reason. Previously we had
> > an int3 at the trampoline, so the user (kretprobe) handler expects that
> > regs->ip is the trampoline address and ri->ret_addr is the original return
> > address. It is better to check the other archs, but I think it is possible
> > to change regs->ip to the original return address, since no one cares about
> > such a "fixed address". :)
> > 
> > For the regs->sp, there are 2 reasons.
> > 
> > For x86-64, it is just an over-optimization (to reduce stack usage).
> > I think we can make a gap for the return address, something like
> > 
> > 	"kretprobe_trampoline:\n"
> > #ifdef CONFIG_X86_64
> > 	"	pushq %rsp\n"	/* Make a gap for return address */
> > 	"	pushq 0(%rsp)\n"	/* Copy original stack pointer */
> > 	"	pushfq\n"
> > 	SAVE_REGS_STRING
> > 	"	movq %rsp, %rdi\n"
> > 	"	call trampoline_handler\n"
> > 	/* Push the true return address to the bottom */
> > 	"	movq %rax, 20*8(%rsp)\n"
> > 	RESTORE_REGS_STRING
> > 	"	popfq\n"
> > 	"	addq $8, %rsp\n"	/* Skip original stack pointer */
> > 
> > For i386 (x86-32), there is no other way to keep &regs->sp as
> > the original stack pointer. It has to be changed with this series,
> > probably in the same way as x86-64.
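
To double-check the 20*8(%rsp) offset in the x86-64 snippet above, here is a
small userspace sketch (not kernel code). The struct is a hypothetical mirror
of the x86-64 pt_regs field order, and it assumes SAVE_REGS_STRING leaves %rsp
pointing at a pt_regs-shaped block, with the slot from pushfq landing on
regs->flags. Under those assumptions the saved sp is at 19*8 and the gap made
by the first pushq is one slot above it, at 20*8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the x86-64 pt_regs field order (not the kernel header). */
struct mock_pt_regs {
	uint64_t r15, r14, r13, r12, bp, bx;
	uint64_t r11, r10, r9, r8;
	uint64_t ax, cx, dx, si, di;
	uint64_t orig_ax;
	uint64_t ip, cs, flags, sp, ss;
};

/* 21 eight-byte slots, with no padding between them. */
static_assert(sizeof(struct mock_pt_regs) == 21 * sizeof(uint64_t),
	      "mock pt_regs must be 21 slots");

/* Byte offset from %rsp (after SAVE_REGS_STRING) of the copied stack pointer. */
size_t saved_sp_offset(void)
{
	return offsetof(struct mock_pt_regs, sp);
}

/* Byte offset of the gap pushed first; it sits one slot above the saved sp. */
size_t ret_addr_gap_offset(void)
{
	return saved_sp_offset() + sizeof(uint64_t);
}
```

So the store of the true return address goes to the slot where regs->ss
would normally live, and the final "addq $8, %rsp" skips the copied sp so
that "ret" pops the gap.
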
> 
> Right; I already fixed that in my patch changing i386's pt_regs.

I see it, and it looks good to me. :)

> But what I'd love to do is something like the below patch, and make all
> the trampolines (very much including ftrace) use that. Such that we then
> only have 1 copy of this magic (well, 2 because x86_64 also needs an
> implementation of this of course).

OK, but I plan to integrate kretprobe with the function-graph tracer,
since it is inefficient to have 2 different hidden return stacks...

Anyway,

> Changing ftrace over to this would be a little more work but it can
> easily chain things a little to get its original context back:
> 
> ENTRY(ftrace_regs_caller)
> GLOBAL(ftrace_regs_func)
> 	push ftrace_stub
> 	push ftrace_regs_handler
> 	jmp call_to_exception_trampoline
> END(ftrace_regs_caller)
> 
> typedef void (*ftrace_func_t)(unsigned long, unsigned long, struct ftrace_op *, struct pt_regs *);
> 
> struct ftrace_regs_stack {
> 	ftrace_func_t func;
> 	unsigned long parent_ip;
> };
> 
> void ftrace_regs_handler(struct pt_regs *regs)
> {
> 	struct ftrace_regs_stack *st = (void *)regs->sp;
> 	ftrace_func_t func = st->func;
> 
> 	regs->sp += sizeof(long); /* pop func */

Sorry, why pop here? 

> 
> 	func(regs->ip, st->parent_ip, function_trace_op, regs);
> }
> 
> Hmm? I didn't look into the function_graph thing, but I imagine it can
> be added without too much pain.

Yes, that should be good for the function_graph trampoline too.
We use a very similar technique.

> 
> ---
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1576,3 +1576,100 @@ ENTRY(rewind_stack_do_exit)
>  	call	do_exit
>  1:	jmp 1b
>  END(rewind_stack_do_exit)
> +
> +/*
> + * Transforms a CALL frame into an exception frame; IOW it pretends the CALL we
> + * just did was in fact scribbled with an INT3.
> + *
> + * Use this trampoline like:
> + *
> + *   PUSH $func
> + *   JMP call_to_exception_trampoline
> + *
> + * $func will see regs->ip point at the CALL instruction and must therefore
> + * modify regs->ip in order to make progress (just like a normal INT3 scribbled
> + * CALL).
> + *
> + * NOTE: we do not restore any of the segment registers.
> + */
> +ENTRY(call_to_exception_trampoline)
> +	/*
> +	 * On entry the stack looks like:
> +	 *
> +	 *   2*4(%esp) <previous context>
> +	 *   1*4(%esp) RET-IP
> +	 *   0*4(%esp) func
> +	 *
> +	 * transform this into:
> +	 *
> +	 *  19*4(%esp) <previous context>
> +	 *  18*4(%esp) gap / RET-IP
> +	 *  17*4(%esp) gap / func
> +	 *  16*4(%esp) ss
> +	 *  15*4(%esp) sp / <previous context>

isn't this "&<previous context>" ?
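
To make the question concrete, here is a tiny userspace model (not kernel
code) of the first three instructions, "pushl %ss; pushl %esp;
addl $3*4, (%esp)". The mock_stack type and the helper names are made up for
illustration. It relies on pushl %esp storing the value %esp had before the
push, and it suggests that the sp slot ends up holding the address of
<previous context>, not its contents.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 32

/* Hypothetical model: mem[i] is the 32-bit word at byte address 4*i. */
struct mock_stack {
	uint32_t mem[STACK_WORDS];
	uint32_t esp;	/* byte address of the current top of stack */
};

static void push(struct mock_stack *s, uint32_t v)
{
	assert(s->esp >= 4);
	s->esp -= 4;
	s->mem[s->esp / 4] = v;
}

/*
 * Build the entry layout: <previous context>, RET-IP, func (func on top).
 * Returns the byte address of the <previous context> word.
 */
uint32_t setup_entry(struct mock_stack *s, uint32_t prev,
		     uint32_t ret_ip, uint32_t func)
{
	uint32_t prev_addr;

	s->esp = STACK_WORDS * 4;
	push(s, prev);
	prev_addr = s->esp;
	push(s, ret_ip);
	push(s, func);
	return prev_addr;
}

/* Model the ss/sp pushes and return what is left in the saved-sp slot. */
uint32_t saved_sp_after_entry(struct mock_stack *s)
{
	uint32_t before;

	push(s, 0);			/* pushl %ss (placeholder value) */
	before = s->esp;		/* pushl %esp stores the pre-push %esp, */
	push(s, before);		/*   which points at the ss slot */
	s->mem[s->esp / 4] += 3 * 4;	/* addl $3*4, (%esp): skip ss, func, RET-IP */
	return s->mem[s->esp / 4];
}
```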

> +	 *  14*4(%esp) flags
> +	 *  13*4(%esp) cs
> +	 *  12*4(%esp) ip / RET-IP
> +	 *  11*4(%esp) orig_eax
> +	 *  10*4(%esp) gs
> +	 *   9*4(%esp) fs
> +	 *   8*4(%esp) es
> +	 *   7*4(%esp) ds
> +	 *   6*4(%esp) eax
> +	 *   5*4(%esp) ebp
> +	 *   4*4(%esp) edi
> +	 *   3*4(%esp) esi
> +	 *   2*4(%esp) edx
> +	 *   1*4(%esp) ecx
> +	 *   0*4(%esp) ebx
> +	 */
> +	pushl	%ss
> +	pushl	%esp		# points at ss
> +	addl	$3*4, (%esp)	#   point it at <previous context>
> +	pushfl
> +	pushl	%cs
> +	pushl	5*4(%esp)	# RET-IP
> +	subl	$5, (%esp)	#   point at CALL instruction
> +	pushl	$-1
> +	pushl	%gs
> +	pushl	%fs
> +	pushl	%es
> +	pushl	%ds
> +	pushl	%eax
> +	pushl	%ebp
> +	pushl	%edi
> +	pushl	%esi
> +	pushl	%edx
> +	pushl	%ecx
> +	pushl	%ebx
> +
> +	ENCODE_FRAME_POINTER
> +
> +	movl	%esp, %eax	# 1st argument: pt_regs
> +
> +	movl	17*4(%esp), %ebx	# func
> +	CALL_NOSPEC %ebx
> +
> +	movl	PT_OLDESP(%esp), %eax

Is PT_OLDESP(%esp) "<previous context>" or "&<previous context>"?

> +
> +	movl	PT_EIP(%esp), %ecx
> +	movl	%ecx, -1*4(%eax)

Ah, OK, so $func must store the true return address in regs->ip
instead of returning it.

> +
> +	movl	PT_EFLAGS(%esp), %ecx
> +	movl	%ecx, -2*4(%eax)
> +
> +	movl	PT_EAX(%esp), %ecx
> +	movl	%ecx, -3*4(%eax)

So, at this point, the stack becomes

 18*4(%esp) RET-IP
 17*4(%esp) eflags
 16*4(%esp) eax

Correct?
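
Here is how I read the exit path, as a small userspace model (not kernel
code). The slot indices come from the layout comment in the patch; the
cpu_state type and trampoline_exit() are made-up names. It assumes regs->sp
still points at <previous context> (slot 19) when $func returns, in which case
the three stores land in slots 18, 17 and 16, and "popl %eax; popfl; ret"
consumes them in that reverse order.

```c
#include <assert.h>
#include <stdint.h>

/* Word indices relative to %esp after all pushes, per the patch comment. */
enum {
	SLOT_EAX = 6, SLOT_IP = 12, SLOT_FLAGS = 14, SLOT_SP = 15,
	SLOT_FUNC = 17, SLOT_RET_IP = 18, SLOT_PREV = 19, NSLOTS = 20
};

/* Hypothetical summary of the state right after the final "ret". */
struct cpu_state {
	uint32_t eax, eflags, eip, esp;
};

/*
 * frame[i] models i*4(%esp); base is the byte address of frame[0].
 * Mirrors the three stores below regs->sp followed by
 * "lea -3*4(%eax), %esp; popl %eax; popfl; ret".
 */
struct cpu_state trampoline_exit(uint32_t frame[NSLOTS], uint32_t base)
{
	struct cpu_state st;
	uint32_t sp = frame[SLOT_SP];		/* movl PT_OLDESP(%esp), %eax */
	uint32_t top = (sp - base) / 4;		/* slot that regs->sp points at */
	uint32_t esp;

	assert(top >= 3 && top <= NSLOTS);

	frame[top - 1] = frame[SLOT_IP];	/* movl %ecx, -1*4(%eax) */
	frame[top - 2] = frame[SLOT_FLAGS];	/* movl %ecx, -2*4(%eax) */
	frame[top - 3] = frame[SLOT_EAX];	/* movl %ecx, -3*4(%eax) */

	esp = sp - 3 * 4;			/* lea -3*4(%eax), %esp */
	st.eax = frame[(esp - base) / 4];	/* popl %eax */
	esp += 4;
	st.eflags = frame[(esp - base) / 4];	/* popfl */
	esp += 4;
	st.eip = frame[(esp - base) / 4];	/* ret */
	esp += 4;
	st.esp = esp;
	return st;
}
```

If $func also changes regs->sp, the same three stores simply follow it, so
the resume point moves together with the stack pointer.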

> +
> +	popl	%ebx
> +	popl	%ecx
> +	popl	%edx
> +	popl	%esi
> +	popl	%edi
> +	popl	%ebp
> +
> +	lea	-3*4(%eax), %esp
> +	popl	%eax
> +	popfl
> +	ret
> +END(call_to_exception_trampoline)
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -731,29 +731,8 @@ asm(
>  	".global kretprobe_trampoline\n"
>  	".type kretprobe_trampoline, @function\n"
>  	"kretprobe_trampoline:\n"
> -	/* We don't bother saving the ss register */
> -#ifdef CONFIG_X86_64
> -	"	pushq %rsp\n"
> -	"	pushfq\n"
> -	SAVE_REGS_STRING
> -	"	movq %rsp, %rdi\n"
> -	"	call trampoline_handler\n"
> -	/* Replace saved sp with true return address. */
> -	"	movq %rax, 19*8(%rsp)\n"
> -	RESTORE_REGS_STRING
> -	"	popfq\n"
> -#else
> -	"	pushl %esp\n"
> -	"	pushfl\n"
> -	SAVE_REGS_STRING
> -	"	movl %esp, %eax\n"
> -	"	call trampoline_handler\n"
> -	/* Replace saved sp with true return address. */
> -	"	movl %eax, 15*4(%esp)\n"
> -	RESTORE_REGS_STRING
> -	"	popfl\n"
> -#endif
> -	"	ret\n"

Here, we need a gap for storing the ret-ip, because kretprobe_trampoline is
the address that the target function returns to. We have no "ret-ip" on
the stack at this point. So something like

+	"push $0\n"	/* This is a gap, will be filled with real return address */

> +	"push trampoline_handler\n"
> +	"jmp call_to_exception_trampoline\n"
>  	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
>  );
>  NOKPROBE_SYMBOL(kretprobe_trampoline);
> @@ -791,12 +770,7 @@ static __used void *trampoline_handler(s
>  
>  	INIT_HLIST_HEAD(&empty_rp);
>  	kretprobe_hash_lock(current, &head, &flags);
> -	/* fixup registers */
> -	regs->cs = __KERNEL_CS;
> -#ifdef CONFIG_X86_32
> -	regs->cs |= get_kernel_rpl();
> -	regs->gs = 0;
> -#endif
> +
>  	/* We use pt_regs->sp for return address holder. */
>  	frame_pointer = &regs->sp;
>  	regs->ip = trampoline_address;

Thank you,

-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>


