Hi,
On 25.4.2023 4.55, Finn Thain wrote:
On Tue, 25 Apr 2023, Finn Thain wrote:
...
I wonder if we are seeing some fallout from the issue described in
do_page_fault() i.e. usp is unreliable.
	/* Accessing the stack below usp is always a bug.  The
	   "+ 256" is there due to some instructions doing
	   pre-decrement on the stack and that doesn't show up
	   until later.  */
	if (address + 256 < rdusp())
		goto map_err;
Maybe we should try modifying get_sigframe() to increase the gap between
the signal and exception frames from 0-1 long words up to 64-65 long
words.
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
 {
 	unsigned long usp = sigsp(rdusp(), ksig);
 
-	return (void __user *)((usp - frame_size) & -8UL);
+	return (void __user *)((usp - 256 - frame_size) & -8UL);
 }
 
 static int setup_frame(struct ksignal *ksig, sigset_t *set,
While this is most likely a Hatari emulation [1] issue, it has some of the
same triggering conditions, so I thought I'd mention it...
The above patch does not fix the kernel panic I'm seeing when booting Linux
under a Hatari-emulated Atari Falcon, with a small IDE root fs containing
just (old Debian) Busybox and a shell script acting as init:
----------------------------------------
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU: 0 PID: 1 Comm: sh Not tainted 6.2.0hatari-00006-g01793428cbc5-dirty #4
Stack from 00819dc8:
00819dc8 0032b646 0032b646 00027800 00000001 00819de8 00292878 0032b646
00819e0c 0028d60e 00027848 0000000b 0081caff 000300d6 00815a90 00000000
00819f2c 00819e50 00028878 003243fd 0000000b 0000000b 00818007 0081ca30
000300d6 0081caf8 00000001 00819f18 0081bbd0 00819f2c 00000000 00000000
00000000 00000000 00819e60 00028ea0 0000000b 0000000b 00819e98 000303c8
0000000b 00818000 00000002 00000000 00000000 00000000 8017705e 00819f78
Call Trace: [<00027800>] set_cpu_online+0x1c/0x3e
[<00292878>] dump_stack+0x10/0x16
[<0028d60e>] panic+0xc4/0x22a
[<00027848>] arch_local_irq_enable+0x0/0x22
[<000300d6>] do_signal_stop+0x0/0x152
[<00028878>] do_exit+0x138/0x642
[<000300d6>] do_signal_stop+0x0/0x152
[<00028ea0>] do_group_exit+0x22/0x62
[<000303c8>] get_signal+0xf8/0x4ba
[<00003508>] test_ti_thread_flag+0x0/0x1a
[<00003f4a>] do_notify_resume+0x36/0x488
[<00005706>] send_fault_sig+0x28/0x8c
[<00005888>] do_page_fault+0x11e/0x242
[<00005814>] do_page_fault+0xaa/0x242
[<00002814>] do_signal_return+0x10/0x1a
[<00020007>] _I_CALL_TOP+0xd83/0x1900
[<0000b280>] sp_over+0x2c/0x3c
[<00007201>] atari_irq_enable+0x3/0x2a
[<000066f6>] atari_get_hardware_list+0x33a/0x3e8
----------------------------------------
(The only way to get rid of the panic is to disable both CPU cache and
prefetch emulation.)
Is it possible that in your case there's also an IRQ (exception) happening
at the same time as the page fault and signal?
- Eero
[1] 030 MMU vs. cache/prefetch vs. exception handling vs. IDE emulation.