Re: [RFC PATCH 1/1] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I just found this in my Inbox, and this looks to be a legit issue.

On Thu, 26 Jul 2018 12:40:29 +0200
Nicolai Stange <nstange@xxxxxxx> wrote:

Nicolai,

You still working on this?

> With dynamic ftrace, ftrace patches call sites in a three steps:
> 1. Put a breakpoint at the to be patched location,
> 2. update call site and
> 3. finally remove the breakpoint again.
> 
> Note that the breakpoint ftrace_int3_handler() simply makes execution
> to skip over the to be patched function.
> 
> This patching happens in the following circumstances:
> - the global ftrace_trace_function changes and the call sites at
>   ftrace_call and ftrace_regs_call get patched,
> - an ftrace_ops' ->func changes and the call site in its trampoline gets
>   patched,
> - graph tracing gets enabled/disabled and the jump site at
>   ftrace_graph_call gets patched
> - a mcount site gets converted from nop -> call, call -> nop
>   or call -> call.
> 
> The latter case, i.e. a mcount call getting redirected, happens for example
> in a transition from trampolined to not trampolined upon a user enabling
> function tracing on a live patched function.
> 
> The ftrace_int3_handler() simply skipping over the mcount callsite is
> problematic here, because it means that a live patched function gets
> temporarily reverted to its unpatched original and this breaks live
> patching consistency.
> 
> Make ftrace_int3_handler not to skip over the fops invocation, but modify
> the interrupted control flow to issue a call as needed.
> 
> For determining what the proper action actually is, modify
> update_ftrace_func() to take an extra argument, func, to be called if the
> corresponding breakpoint gets hit. Introduce a new global variable,
> trace_update_func_dest and let update_ftrace_func() set it. For the special
> case of patching the jmp insn at ftrace_graph_call, set it to zero and make
> ftrace_int3_handler() just skip over the insn in this case as before.
> 
> Because there's no space left above the int3 handler's iret frame to stash
> an extra call's return address in, this can't be mimicked from the
> ftrace_int3_handler() itself.
> 
> Instead, make ftrace_int3_handler() change the iret frame's %rip slot to
> point to the new ftrace_int3_call_trampoline to be executed immediately
> after iret.
> 
> The original %rip gets communicated to ftrace_int3_call_trampoline which
> can then take proper action. Note that ftrace_int3_call_trampoline can
> nest, because of NMIs, for example. In order to avoid introducing another
> stack, abuse %r11 for passing the original %rip. This is safe, because the
> interrupted context is always at a call insn and according to the x86_64
> ELF ABI allows %r11 is callee-clobbered. According to the glibc sources,
> this is also true for the somewhat special mcount/fentry ABI.
> 
> OTOH, a spare register doesn't exist on i686 and thus, this is approach is
> limited to x86_64.
> 
> For determining the call target address, let ftrace_int3_call_trampoline
> compare ftrace_update_func against the interrupted %rip.

I don't think we need to do the compare.

> 
> If they match, an update_ftrace_func() is currently pending and the
> call site is either of ftrace_call, ftrace_regs_call or the call insn
> within a fops' trampoline. Jump to the ftrace_update_func_dest location as
> set by update_ftrace_func().
> 
> If OTOH the interrupted %rip doesn't equal ftrace_update_func, then
> it points to an mcount call site. In this case, redirect control flow to
> the most generic handler, ftrace_regs_caller, which will then do the right
> thing.
> 
> Finally, reading ftrace_update_func and ftrace_update_func_dest from
> outside of the int3 handler requires some care: inbetween adding and
> removing the breakpoints, ftrace invokes run_sync() which basically
> issues a couple of IPIs. Because the int3 handler has interrupts disabled,
> it orders with run_sync().
> 
> To extend that ordering to also include ftrace_int3_call_trampoline, make
> ftrace_int3_handler() disable interrupts within the iret frame returning to
> it.
> 
> Store the original EFLAGS.IF into %r11's MSB which, because it represents
> a kernel address, is redundant. Make ftrace_int3_call_trampoline restore
> it when done.

This can be made much simpler by making a hardcoded ftrace_int3_tramp
that does the following:

ftrace_int3_tramp
	push	%r11
	jmp	ftrace_caller


The ftrace_caller will either call a single ftrace callback, if there's
only a single ops registered, or it will call the loop function that
iterates over all the ftrace_ops registered and will call each function
that matches the ftrace_ops hashes.

In either case, it's what we want.

The ftrace_int3_tramp will simply simulate the call ftrace_caller that
would be there as the default caller if more than one function is set
to it.

For 32 bit, we could add 4 variables on the thread_info and make 4
trampolines, one for each context (normal, softirq, irq and NMI), and
have them use the variable stored in the thread_info for that location:

ftrace_int3_tramp_normal
	push %eax # just for space
	push %eax
	mov whatever_to_get_thread_info, %eax
	mov normal(%eax), %eax
	mov %eax, 4(%esp)
	pop %eax
	jmp ftrace_caller

ftrace_int3_tramp_softiqr
	push %eax # just for space
	push %eax
	mov whatever_to_get_thread_info, %eax
	mov softirq(%eax), %eax
	mov %eax, 4(%esp)
	pop %eax
	jmp ftrace_caller

etc..


A bit crazier for 32 bit but it can still be done. ;-)

-- Steve



> 
> Signed-off-by: Nicolai Stange <nstange@xxxxxxx>
> ---
>  arch/x86/kernel/ftrace.c    | 48 ++++++++++++++++++++++++++++++++------
>  arch/x86/kernel/ftrace_64.S | 56 +++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 97 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 01ebcb6f263e..3908a73370a8 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -227,9 +227,11 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>  	return -EINVAL;
>  }
>  
> -static unsigned long ftrace_update_func;
> +unsigned long ftrace_update_func;
> +unsigned long ftrace_update_func_dest;
>  
> -static int update_ftrace_func(unsigned long ip, void *new)
> +static int update_ftrace_func(unsigned long ip, void *new,
> +			      unsigned long func)
>  {
>  	unsigned char old[MCOUNT_INSN_SIZE];
>  	int ret;
> @@ -237,6 +239,8 @@ static int update_ftrace_func(unsigned long ip, void *new)
>  	memcpy(old, (void *)ip, MCOUNT_INSN_SIZE);
>  
>  	ftrace_update_func = ip;
> +	ftrace_update_func_dest = func;
> +
>  	/* Make sure the breakpoints see the ftrace_update_func update */
>  	smp_wmb();
>  
> @@ -257,13 +261,13 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>  	int ret;
>  
>  	new = ftrace_call_replace(ip, (unsigned long)func);
> -	ret = update_ftrace_func(ip, new);
> +	ret = update_ftrace_func(ip, new, (unsigned long)func);
>  
>  	/* Also update the regs callback function */
>  	if (!ret) {
>  		ip = (unsigned long)(&ftrace_regs_call);
>  		new = ftrace_call_replace(ip, (unsigned long)func);
> -		ret = update_ftrace_func(ip, new);
> +		ret = update_ftrace_func(ip, new, (unsigned long)func);
>  	}
>  
>  	return ret;
> @@ -277,6 +281,10 @@ static int is_ftrace_caller(unsigned long ip)
>  	return 0;
>  }
>  
> +#if IS_ENABLED(CONFIG_X86_64)
> +extern char ftrace_int3_call_trampoline[];
> +#endif
> +
>  /*
>   * A breakpoint was added to the code address we are about to
>   * modify, and this is the handle that will just skip over it.
> @@ -287,6 +295,7 @@ static int is_ftrace_caller(unsigned long ip)
>  int ftrace_int3_handler(struct pt_regs *regs)
>  {
>  	unsigned long ip;
> +	unsigned long __maybe_unused eflags_if;
>  
>  	if (WARN_ON_ONCE(!regs))
>  		return 0;
> @@ -295,8 +304,33 @@ int ftrace_int3_handler(struct pt_regs *regs)
>  	if (!ftrace_location(ip) && !is_ftrace_caller(ip))
>  		return 0;
>  
> -	regs->ip += MCOUNT_INSN_SIZE - 1;
> +#if IS_ENABLED(CONFIG_X86_64)
> +	if (is_ftrace_caller(ip) && !ftrace_update_func_dest) {
> +		/*
> +		 * There's an update on ftrace_graph_call pending.
> +		 * Just skip over it.
> +		 */
> +		regs->ip += MCOUNT_INSN_SIZE - 1;
> +		return 1;
> +	}
>  
> +	/*
> +	 * Return to ftrace_int3_call_trampoline with interrupts
> +	 * disabled in order to block ftrace's run_sync() IPIs
> +	 * and keep ftrace_update_func_dest valid.
> +	 */
> +	eflags_if = regs->flags & X86_EFLAGS_IF;
> +	regs->flags ^= eflags_if;
> +	/*
> +	 * The MSB of ip, which is a kernel address, is always one.
> +	 * Abuse it to store EFLAGS.IF there.
> +	 */
> +	ip &= eflags_if << (BITS_PER_LONG - 1 - X86_EFLAGS_IF_BIT);
> +	regs->r11 = ip;
> +	regs->ip = (unsigned long)ftrace_int3_call_trampoline;
> +#else /* !IS_ENABLED(CONFIG_X86_64) */
> +	regs->ip += MCOUNT_INSN_SIZE - 1;
> +#endif
>  	return 1;
>  }
>  
> @@ -870,7 +904,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  
>  	/* Do a safe modify in case the trampoline is executing */
>  	new = ftrace_call_replace(ip, (unsigned long)func);
> -	ret = update_ftrace_func(ip, new);
> +	ret = update_ftrace_func(ip, new, (unsigned long)func);
>  	set_memory_ro(ops->trampoline, npages);
>  
>  	/* The update should never fail */
> @@ -966,7 +1000,7 @@ static int ftrace_mod_jmp(unsigned long ip, void *func)
>  
>  	new = ftrace_jmp_replace(ip, (unsigned long)func);
>  
> -	return update_ftrace_func(ip, new);
> +	return update_ftrace_func(ip, new, 0);
>  }
>  
>  int ftrace_enable_ftrace_graph_caller(void)
> diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> index 91b2cff4b79a..b51d4fb9af79 100644
> --- a/arch/x86/kernel/ftrace_64.S
> +++ b/arch/x86/kernel/ftrace_64.S
> @@ -9,6 +9,7 @@
>  #include <asm/export.h>
>  #include <asm/nospec-branch.h>
>  #include <asm/unwind_hints.h>
> +#include <asm/irqflags.h>
>  
>  	.code64
>  	.section .entry.text, "ax"
> @@ -262,6 +263,61 @@ GLOBAL(ftrace_regs_caller_end)
>  ENDPROC(ftrace_regs_caller)
>  
>  
> +ENTRY(ftrace_int3_call_trampoline)
> +	/*
> +	 * A breakpoint was hit on what either had been or will become
> +	 * a call insn. Mimic the function call here: push the desired
> +	 * return address on the stack and jmp to the target function.
> +	 * The inverted value of EFLAGS.IF is stored in %r11's high bit.
> +	 * The remaining bits of %r11 store the break point address
> +	 * (whose MSB is known to always be set, because it's a kernel
> +	 * address).
> +	 */
> +	UNWIND_HINT_EMPTY
> +	subq $8, %rsp
> +	pushq %rax
> +	movq $1, %rax
> +	shlq $63, %rax
> +	orq %r11, %rax   /* ip is in %rax now */
> +
> +	/* Prepare the on-stack return address. */
> +	pushq %rax
> +	addq $5, %rax    /* ip + MCOUNT_INSN_SIZE */
> +	movq %rax, 16(%rsp)
> +	popq %rax
> +
> +	/*
> +	 * Determine where to redirect control flow to. There are
> +	 * two cases:
> +	 * 1.) The breakpoint is at either of ftrace_call,
> +	 *     ftrace_regs_call or the callsite within a fops' trampoline.
> +	 * 2.) The breakpoint is at an mcount callsite.
> +	 *
> +	 * For the first case, update_ftrace_func has setup
> +	 * ftrace_update_func_dest to the target address.
> +	 * For the second case, just branch to the most generic
> +	 * ftrace_regs_caller which will then do the right thing.
> +	 */
> +	cmpq %rax, ftrace_update_func(%rip)
> +	jne 1f
> +	movq ftrace_update_func_dest(%rip), %rax
> +	jmp 2f
> +1:
> +	leaq ftrace_regs_caller(%rip), %rax
> +2:
> +	/* Restore EFLAGS.IF */
> +	btq $63, %r11
> +	jnc 3f
> +	sti
> +	TRACE_IRQS_ON
> +3:
> +	mov %rax, %r11
> +	popq %rax
> +
> +	JMP_NOSPEC %r11
> +END(ftrace_int3_call_trampoline)
> +
> +
>  #else /* ! CONFIG_DYNAMIC_FTRACE */
>  
>  ENTRY(function_hook)




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux Kernel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux