Patch "sched/scs: Reset task stack state in bringup_cpu()" has been added to the 5.10-stable tree

This is a note to let you know that I've just added the patch titled

    sched/scs: Reset task stack state in bringup_cpu()

to the 5.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sched-scs-reset-task-stack-state-in-bringup_cpu.patch
and it can be found in the queue-5.10 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit a61e3acfadcda1c5cf7d953f99cc8412a9b80713
Author: Mark Rutland <mark.rutland@xxxxxxx>
Date:   Tue Nov 23 11:40:47 2021 +0000

    sched/scs: Reset task stack state in bringup_cpu()
    
    [ Upstream commit dce1ca0525bfdc8a69a9343bc714fbc19a2f04b3 ]
    
    To hot unplug a CPU, the idle task on that CPU calls a few layers of C
    code before finally leaving the kernel. When KASAN is in use, poisoned
    shadow is left around for each of the active stack frames, and when
    shadow call stacks (SCS) are in use the task's saved SCS SP is left
    pointing at an arbitrary point within the task's shadow call stack.
    
    When a CPU is offlined then onlined back into the kernel, this stale
    state can adversely affect execution. Stale KASAN shadow can alias new
    stackframes and result in bogus KASAN warnings. A stale SCS SP is
    effectively a memory leak, and prevents a portion of the shadow call
    stack being used. Across a number of hotplug cycles the idle task's
    entire shadow call stack can become unusable.
    
    We previously fixed the KASAN issue in commit:
    
      e1b77c92981a5222 ("sched/kasan: remove stale KASAN poison after hotplug")
    
    ... by removing any stale KASAN stack poison immediately prior to
    onlining a CPU.
    
    Subsequently in commit:
    
      f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled")
    
    ... the refactoring left the KASAN and SCS cleanup in one-time idle
    thread initialization code rather than something invoked prior to each
    CPU being onlined, breaking both as above.
    
    We fixed SCS (but not KASAN) in commit:
    
      63acd42c0d4942f7 ("sched/scs: Reset the shadow stack when idle_task_exit")
    
    ... but as this runs in the context of the idle task being offlined it's
    potentially fragile.
    
    To fix these consistently and more robustly, reset the SCS SP and KASAN
    shadow of a CPU's idle task immediately before we online that CPU in
    bringup_cpu(). This ensures the idle task always has a consistent state
    when it is running, and removes the need to do so when exiting an idle
    task.
    
    Whenever any thread is created, dup_task_struct() will give the task a
    stack which is free of KASAN shadow, and initialize the task's SCS SP,
    so there's no need to specially initialize either for the idle thread
    init_idle(), as this was only necessary to handle hotplug cycles.
    
    I've tested this on arm64 with:
    
    * gcc 11.1.0, defconfig +KASAN_INLINE, KASAN_STACK
    * clang 12.0.0, defconfig +KASAN_INLINE, KASAN_STACK, SHADOW_CALL_STACK
    
    ... offlining and onlining CPUS with:
    
    | while true; do
    |   for C in /sys/devices/system/cpu/cpu*/online; do
    |     echo 0 > $C;
    |     echo 1 > $C;
    |   done
    | done
    
    Fixes: f1a0a376ca0c4ef1 ("sched/core: Initialize the idle task with preemption disabled")
    Reported-by: Qian Cai <quic_qiancai@xxxxxxxxxxx>
    Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
    Tested-by: Qian Cai <quic_qiancai@xxxxxxxxxxx>
    Link: https://lore.kernel.org/lkml/20211115113310.35693-1-mark.rutland@xxxxxxx/
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 67c22941b5f27..c06ced18f78ad 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -31,6 +31,7 @@
 #include <linux/smpboot.h>
 #include <linux/relay.h>
 #include <linux/slab.h>
+#include <linux/scs.h>
 #include <linux/percpu-rwsem.h>
 #include <linux/cpuset.h>
 
@@ -551,6 +552,12 @@ static int bringup_cpu(unsigned int cpu)
 	struct task_struct *idle = idle_thread_get(cpu);
 	int ret;
 
+	/*
+	 * Reset stale stack state from the last time this CPU was online.
+	 */
+	scs_task_reset(idle);
+	kasan_unpoison_task_stack(idle);
+
 	/*
 	 * Some architectures have to walk the irq descriptors to
 	 * setup the vector space for the cpu which comes online.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e456cce772a3a..304aad997da11 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6523,9 +6523,6 @@ void __init init_idle(struct task_struct *idle, int cpu)
 	idle->se.exec_start = sched_clock();
 	idle->flags |= PF_IDLE;
 
-	scs_task_reset(idle);
-	kasan_unpoison_task_stack(idle);
-
 #ifdef CONFIG_SMP
 	/*
 	 * Its possible that init_idle() gets called multiple times on a task,
@@ -6681,7 +6678,6 @@ void idle_task_exit(void)
 		finish_arch_post_lock_switch();
 	}
 
-	scs_task_reset(current);
 	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
 }
 


