On Thu, Apr 11, 2024 at 03:36:31PM +1200, Michael Schmitz wrote:
Context switching does take care to retain the correct lock owner across the switch from 'prev' to 'next' tasks. This does rely on interrupts remaining disabled for the entire duration of the switch. This condition is guaranteed for normal process creation and context switching between already running processes, because both 'prev' and 'next' already have interrupts disabled in their saved copies of the status register. The situation is different for newly created kernel threads. The status register is set to PS_S in copy_thread(), which does leave the IPL at 0. Upon restoring the 'next' thread's status register in switch_to() aka resume(), interrupts then become enabled prematurely. resume() then returns via ret_from_kernel_thread() and schedule_tail() where run queue lock is released (see finish_task_switch() and finish_lock_switch()). A timer interrupt calling scheduler_tick() before the lock is released in finish_task_switch() will find the lock already taken, with the current task as lock owner. This causes a spinlock recursion warning as reported by Guenter Roeck. As far as I can ascertain, this race has been opened in commit 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") but I haven't done a detailed study of kernel history so it may well predate that commit. Interrupts cannot be disabled in the saved status register copy for kernel threads (init will complain about interrupts disabled when finally starting user space). Disable interrupts temporarily when switching the tasks' register sets in resume(). Note that a simple oriw 0x700,%sr after restoring sr is not enough here - this leaves enough of a race for the 'spinlock recursion' warning to still be observed. Tested on ARAnyM and qemu (Quadra 800 emulation). Cc: Guenter Roeck <linux@xxxxxxxxxxxx> Cc: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> Cc: Finn Thain <fthain@xxxxxxxxxxxxxx> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Cc: stable@xxxxxxxxxx # 3.6 Cc: linux-m68k@xxxxxxxxxxxxxxxxxxxx Link: https://lore.kernel.org/all/07811b26-677c-4d05-aeb4-996cd880b789@xxxxxxxxxxxx Fixes: 533e6903bea0 ("m68k: split ret_from_fork(), simplify kernel_thread()") Signed-off-by: Michael Schmitz <schmitzmic@xxxxxxxxx>
Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx> With this patch applied, the problem was not observed over more than 1,500 test runs. Without this patch, the problem is seen roughly once in ~100 test runs. Tested with qemu's q800 emulation. Guenter
--- arch/m68k/kernel/entry.S | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S index 9933679ea28b..681c008270dc 100644 --- a/arch/m68k/kernel/entry.S +++ b/arch/m68k/kernel/entry.S @@ -446,7 +446,9 @@ resume: movec %a0,%dfc /* restore status register */ - movew %a1@(TASK_THREAD+THREAD_SR),%sr + movew %a1@(TASK_THREAD+THREAD_SR),%d0 + oriw #0x0700,%d0 + movew %d0,%sr rts -- 2.17.1