Re: signal delivery, was Re: reliable reproducer

Hi,

On 25.4.2023 4.55, Finn Thain wrote:
On Tue, 25 Apr 2023, Finn Thain wrote:
...
I wonder if we are seeing some fallout from the issue described in
do_page_fault() i.e. usp is unreliable.

                 /* Accessing the stack below usp is always a bug.  The
                    "+ 256" is there due to some instructions doing
                    pre-decrement on the stack and that doesn't show up
                    until later.  */
                 if (address + 256 < rdusp())
                         goto map_err;

Maybe we should try modifying get_sigframe() to increase the gap between
the signal and exception frames from 0-1 long words up to 64-65 long
words.

It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?

diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
 {
 	unsigned long usp = sigsp(rdusp(), ksig);
-	return (void __user *)((usp - frame_size) & -8UL);
+	return (void __user *)((usp - 256 - frame_size) & -8UL);
 }
 static int setup_frame(struct ksignal *ksig, sigset_t *set,
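
For reference, the arithmetic in the patch can be checked outside the kernel. This is only a sketch under stated assumptions: it models the m68k unsigned long as uint32_t, ignores sigsp() (i.e. assumes no alternate signal stack), and the helper names and the usp/frame_size values below are made up for illustration.

```c
#include <stdint.h>

/* m68k is a 32-bit target, so unsigned long is modelled as uint32_t here. */

/* Frame address as computed before the patch (hypothetical helper). */
static uint32_t sigframe_old(uint32_t usp, uint32_t frame_size)
{
	return (usp - frame_size) & ~(uint32_t)7;	/* same as & -8UL */
}

/* Frame address with the extra 256 bytes from the patch (hypothetical helper). */
static uint32_t sigframe_new(uint32_t usp, uint32_t frame_size)
{
	return (usp - 256 - frame_size) & ~(uint32_t)7;
}

/* Unused bytes between the end of the signal frame and usp. */
static uint32_t gap_bytes(uint32_t usp, uint32_t frame, uint32_t frame_size)
{
	return usp - (frame + frame_size);
}
```

With usp = 0x1000 and frame_size = 100, the old code leaves a 4 byte gap (1 long word) and the patched code leaves 260 bytes (65 long words). With frame_size = 104 the gaps are 0 and 256 bytes (0 and 64 long words), which matches the 0-1 versus 64-65 long words mentioned above.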

While this is most likely a Hatari emulation [1] issue, it shares some of the same triggering conditions, so I thought I'd mention it...

The above patch does not fix the kernel panic I'm seeing when booting Linux on a Hatari-emulated Atari Falcon, with a small IDE root fs containing just (old Debian) Busybox and a shell script acting as init:
----------------------------------------
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU: 0 PID: 1 Comm: sh Not tainted 6.2.0hatari-00006-g01793428cbc5-dirty #4
Stack from 00819dc8:
       00819dc8 0032b646 0032b646 00027800 00000001 00819de8 00292878 0032b646
       00819e0c 0028d60e 00027848 0000000b 0081caff 000300d6 00815a90 00000000
       00819f2c 00819e50 00028878 003243fd 0000000b 0000000b 00818007 0081ca30
       000300d6 0081caf8 00000001 00819f18 0081bbd0 00819f2c 00000000 00000000
       00000000 00000000 00819e60 00028ea0 0000000b 0000000b 00819e98 000303c8
       0000000b 00818000 00000002 00000000 00000000 00000000 8017705e 00819f78
Call Trace: [<00027800>] set_cpu_online+0x1c/0x3e
 [<00292878>] dump_stack+0x10/0x16
 [<0028d60e>] panic+0xc4/0x22a
 [<00027848>] arch_local_irq_enable+0x0/0x22
 [<000300d6>] do_signal_stop+0x0/0x152
 [<00028878>] do_exit+0x138/0x642
 [<000300d6>] do_signal_stop+0x0/0x152
 [<00028ea0>] do_group_exit+0x22/0x62
 [<000303c8>] get_signal+0xf8/0x4ba
 [<00003508>] test_ti_thread_flag+0x0/0x1a
 [<00003f4a>] do_notify_resume+0x36/0x488
 [<00005706>] send_fault_sig+0x28/0x8c
 [<00005888>] do_page_fault+0x11e/0x242
 [<00005814>] do_page_fault+0xaa/0x242
 [<00002814>] do_signal_return+0x10/0x1a
 [<00020007>] _I_CALL_TOP+0xd83/0x1900
 [<0000b280>] sp_over+0x2c/0x3c
 [<00007201>] atari_irq_enable+0x3/0x2a
 [<000066f6>] atari_get_hardware_list+0x33a/0x3e8
----------------------------------------
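
As a side note on reading the first line of the panic: assuming the exitcode printed there uses the usual wait-status layout (terminating signal in the low 7 bits), 0x0000000b decodes to signal 11, which is SIGSEGV on Linux. A minimal sketch, with a helper name that is made up for illustration:

```c
/*
 * Extract the terminating signal from an exit code, assuming the
 * usual wait-status layout where the low 7 bits hold the signal number.
 * term_signal() is a hypothetical helper, not a kernel function.
 */
static int term_signal(unsigned int exitcode)
{
	return (int)(exitcode & 0x7f);
}
```

That is consistent with the call trace, where init is killed via send_fault_sig() from do_page_fault().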

(The only way to get rid of the panic is to disable both CPU cache and prefetch emulation.)


Is it possible that in your case there is also an IRQ (exception) happening at the same time as the page fault and signal?


	- Eero

[1] 030 MMU vs. cache/prefetch vs. exception handling vs. IDE emulation.


