Patch "x86/unwind/orc: Fix unreliable stack dump with gcov" has been added to the 4.14-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Mon, 31 Oct 2022 11:24:11 -0400

This is a note to let you know that I've just added the patch titled

    x86/unwind/orc: Fix unreliable stack dump with gcov

to the 4.14-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     x86-unwind-orc-fix-unreliable-stack-dump-with-gcov.patch
and it can be found in the queue-4.14 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit d966ca7392482bb2c67c5ec841b2bf27d1282652
Author: Chen Zhongjin <chenzhongjin@xxxxxxxxxx>
Date:   Wed Jul 27 11:15:06 2022 +0800

    x86/unwind/orc: Fix unreliable stack dump with gcov
    
    [ Upstream commit 230db82413c091bc16acee72650f48d419cebe49 ]
    
    When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL
    enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder,
    causing the stack trace to show all text addresses as unreliable:
    
      # echo l > /proc/sysrq-trigger
      [  477.521031] sysrq: Show backtrace of all active CPUs
      [  477.523813] NMI backtrace for cpu 0
      [  477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65
      [  477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
      [  477.526439] Call Trace:
      [  477.526854]  <TASK>
      [  477.527216]  ? dump_stack_lvl+0xc7/0x114
      [  477.527801]  ? dump_stack+0x13/0x1f
      [  477.528331]  ? nmi_cpu_backtrace.cold+0xb5/0x10d
      [  477.528998]  ? lapic_can_unplug_cpu+0xa0/0xa0
      [  477.529641]  ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0
      [  477.530393]  ? arch_trigger_cpumask_backtrace+0x1d/0x30
      [  477.531136]  ? sysrq_handle_showallcpus+0x1b/0x30
      [  477.531818]  ? __handle_sysrq.cold+0x4e/0x1ae
      [  477.532451]  ? write_sysrq_trigger+0x63/0x80
      [  477.533080]  ? proc_reg_write+0x92/0x110
      [  477.533663]  ? vfs_write+0x174/0x530
      [  477.534265]  ? handle_mm_fault+0x16f/0x500
      [  477.534940]  ? ksys_write+0x7b/0x170
      [  477.535543]  ? __x64_sys_write+0x1d/0x30
      [  477.536191]  ? do_syscall_64+0x6b/0x100
      [  477.536809]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [  477.537609]  </TASK>
    
    This happens when the compiled code for show_stack() has a single word
    on the stack, and doesn't use a tail call to show_stack_log_lvl().
    (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.)  Then the
    __unwind_start() skip logic hits an off-by-one bug and fails to unwind
    all the way to the intended starting frame.
    
    Fix it by reverting the following commit:
    
      f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
    
    The original justification for that commit no longer exists.  That
    original issue was later fixed in a different way, with the following
    commit:
    
      f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels")
    
    Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
    Signed-off-by: Chen Zhongjin <chenzhongjin@xxxxxxxxxx>
    [jpoimboe: rewrite commit log]
    Signed-off-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
    Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index e64c5b78fbfd..350f40f9a0bf 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -579,7 +579,7 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task,
 	/* Otherwise, skip ahead to the user-specified starting frame: */
 	while (!unwind_done(state) &&
 	       (!on_stack(&state->stack_info, first_frame, sizeof(long)) ||
-			state->sp < (unsigned long)first_frame))
+			state->sp <= (unsigned long)first_frame))
 		unwind_next_frame(state);
 
 	return;
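
For readers who want to see why the comparison operator matters, here is a
minimal userspace sketch of the skip loop. It is a toy model, not kernel code:
the function name skip_to_first_frame(), the array of stack pointer values, and
the "inclusive" flag are all hypothetical, and the "i + 1 < n" bound only
loosely stands in for unwind_done(). It assumes, as the commit log describes,
that the boundary case is the one where the unwinder's stack pointer is exactly
equal to first_frame.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of the skip loop in __unwind_start() (hypothetical helper).
 *
 * sp[i] is the stack pointer value the unwinder would hold after unwinding
 * i frames; values increase toward older frames because the stack grows down.
 * Returns the index of the frame at which skipping stops.
 *
 * inclusive == 0 models the old test: sp <  first_frame
 * inclusive != 0 models the new test: sp <= first_frame
 */
static size_t skip_to_first_frame(const unsigned long *sp, size_t n,
				  unsigned long first_frame, int inclusive)
{
	size_t i = 0;

	assert(n > 0);

	/* "i + 1 < n" is a stand-in for !unwind_done(state). */
	while (i + 1 < n &&
	       (inclusive ? sp[i] <= first_frame : sp[i] < first_frame))
		i++;	/* stand-in for unwind_next_frame(state) */

	return i;
}
```

With sp values { 0x1000, 0x1008, 0x1010, 0x1040, 0x1080 } and first_frame equal
to 0x1010, the "<" form stops at index 2 while the "<=" form continues to
index 3. When first_frame falls strictly between two sp values, both forms stop
at the same index, so the two tests only differ in the exact-equality case.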