Hi Andreas,
On 26/04/23 07:46, Michael Schmitz wrote:
Hi Andreas,
On 25/04/23 23:25, Andreas Schwab wrote:
On Apr 25 2023, Finn Thain wrote:
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
Probably the issue is that a bus error exception should never start
signal delivery when returning to user space. On the 030 returning from
a bus error resumes the execution of the faulting insn (unlike the
040/060 which restart it), and the saved USP may have the original value
from before the insn started (modified registers may not be updated
until the insn is complete or just before the final bus cycle). Signal
delivery should only ever happen at insn boundaries.
Thanks - we had seen evidence that a bus error generated
mid-instruction did leave the USP at the address where the bus fault
happened (neither its value from before the instruction started, nor what
it would have been once the instruction completed), and the operation did
not complete normally after the bus error (at least the value/address seen
in the exception frame was not stored). Finn had also demonstrated that
skipping signal delivery on bus errors abolishes the stack
corruption. Your patch achieves the same objective in a different
way, so I'm sure this will work as well.
I had thought the 030 could resume the interrupted instruction using
the information from the exception frame - and that does appear to
work in all other cases except where signal delivery gets in the way,
and it also works when the exception frame is moved a little further
down the stack. So our treatment of the bus error exception frame
during signal delivery appears to be incorrect. Wouldn't you agree?
Inspection of the format b frame placed in the signal frame in both rt
and non-rt cases (at the time the signal handler runs) shows the
expected contents in the data output buffer, data fault address and ssw.
At that time, returning to user space with rte would correctly resume
the instruction execution. I had previously confirmed that the register
contents saved in the rt signal frame are also correct.
That is with a kernel patched similarly to Finn's patch above (using an
offset of 128 or 64 instead of 256).
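For reference, a rough user-space sketch of the address arithmetic that the extra offset changes in get_sigframe(). This is not kernel code: sigframe_addr(), its gap parameter and the sample values mentioned below are made up for illustration, and the sigsp() alternate-stack handling is left out.

```c
#include <stddef.h>

/*
 * Hypothetical model of the computation in get_sigframe().
 * usp:        user stack pointer at signal delivery
 * gap:        extra space left below usp (0 unpatched; 64, 128 or 256
 *             in the experiments discussed in this thread)
 * frame_size: size of the signal frame to be placed on the user stack
 *
 * The result is rounded down to a multiple of 8, as in the kernel code.
 */
static unsigned long sigframe_addr(unsigned long usp, size_t gap,
				   size_t frame_size)
{
	return (usp - gap - frame_size) & -8UL;
}
```

With an assumed usp of 0xeffff006 and a frame size of 200 bytes, this gives 0xefffef38 for a gap of 0 and 0xefffee38 for a gap of 256, so the signal frame simply starts 256 bytes further down the stack and whatever sits just below the saved USP is left alone.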
Cheers,
Michael