On Wed, 19 Apr 2023, Michael Schmitz wrote:

>> I wonder what we'd see if we patched the kernel to log every user data
>> write fault caused by a MOVEM instruction. I'll try to code that up.
>
> If these instructions did always cause stack corruption on 030, I think
> we would have noticed long ago?

I think it probably was noticed long ago, in the form of rare userland
crashes on 68030. But it was probably never reported because the actual
culprit is too distant from the symptoms.
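
For the logging idea above, the core of such a patch would be a test on the
opcode word at the faulting PC. What follows is only a rough user-space
sketch of that test, not kernel code: the helper names is_movem() and
is_movem_store() are made up for illustration, and how the opcode gets
fetched from user space and where the log call would go are left open.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * MOVEM opcode word: 0100 1d00 1smm mrrr
 *   d   = 0: registers to memory (a write), 1: memory to registers
 *   s   = 0: word transfer, 1: long transfer
 *   mmm = effective address mode, rrr = effective address register
 *
 * Hypothetical helpers, for illustration only. This is a coarse match;
 * it does not validate every effective address mode.
 */
static bool is_movem(uint16_t opcode)
{
	/* Ignore the direction bit (0x0400) and compare the fixed bits. */
	if ((opcode & 0xfb80) != 0x4880)
		return false;
	/* With EA mode 000 (Dn) this bit pattern encodes EXT, not MOVEM. */
	return (opcode & 0x0038) != 0;
}

/* True only for the register-to-memory form, i.e. the one that writes. */
static bool is_movem_store(uint16_t opcode)
{
	return is_movem(opcode) && !(opcode & 0x0400);
}
```

For instance, 0x48e7 (movem.l <regs>,-(sp)) matches is_movem_store(), while
0x4cdf (movem.l (sp)+,<regs>) is a MOVEM but not a store.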
But I take your point -- signal delivery seems to be crucial. Would it be
difficult to skip signal delivery following a bus error? Perhaps there's
no need to try that experiment, as we know what would happen.
I will take a look at your modified test program and try to use the output
to figure out the stack gymnastics.
IIUC, there are two RTEs following the page fault. The first one runs the
signal handler, the second one resumes the MOVEM that faulted. Maybe we'll
have to intercept the latter (at do_sigreturn() perhaps?) and examine that
exception frame.
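
If we do end up examining that frame, the first things to check would be
the format/vector word and, for a bus fault frame, the special status word.
Below is a minimal sketch of that decoding, assuming the 68030 layout
(format in the top four bits of the format/vector word, bus fault frames in
formats $A and $B, SSW bits as in the 68030 manual). The function names are
hypothetical, and whether the frame seen at that point is still one of
these formats is exactly what remains to be found out.

```c
#include <stdbool.h>
#include <stdint.h>

/* 68030 special status word bits (bus fault frames, formats $A and $B). */
#define SSW_DF	0x0100	/* fault on a data cycle */
#define SSW_RM	0x0080	/* read-modify-write cycle */
#define SSW_RW	0x0040	/* 1 = read cycle, 0 = write cycle */

/* Top four bits of the format/vector word select the frame format. */
static unsigned int frame_format(uint16_t format_vector)
{
	return format_vector >> 12;
}

/* Low twelve bits hold the vector offset (0x008 for a bus error). */
static unsigned int frame_vector_offset(uint16_t format_vector)
{
	return format_vector & 0x0fff;
}

/* Short ($A) and long ($B) bus cycle fault frames on 68020/68030. */
static bool is_bus_fault_frame(uint16_t format_vector)
{
	unsigned int format = frame_format(format_vector);

	return format == 0xa || format == 0xb;
}

/* A data cycle fault with RW clear is a faulted data write. */
static bool ssw_is_data_write_fault(uint16_t ssw)
{
	return (ssw & SSW_DF) && !(ssw & SSW_RW);
}
```

A format/vector word of 0xb008 would decode as a long bus cycle fault frame
for the bus error vector.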