On Fri, 9 Mar 2018, Michael Schmitz wrote:

> How does the PDMA logic raise the exception? If we find none of the
> usual MMU status register bits are set, we could take that as an
> indication that the exception wasn't raised by the MMU, so no page or
> protection fault. Pretty much leaves only the PDMA logic (if present).

It's a hardware exception, not a software exception. The bus error is
generated by signals from one of Apple's ASICs. This logic circuit
effectively interfaces the SCSI bus with the system bus, via the SCSI
controller, for performance. But that's hardly relevant. I'm more
interested in the bug in mainline, not the bug in my RFC patch.

> > - What are the implications of the existing logic error?
>
> We might miss handling (MMU_B|MMU_L|MMU_S) && (ssw & RM) (should be
> harmless),

I think that leads to a recursive fault which would kill the machine
instead of just the user process that caused the fault, but I don't have
code to confirm this.

> and might log (MMU_B|MMU_L|MMU_S) && !(ssw & RM) as unexpected bus error
> if there wasn't a user process to signal or an exception vector to fix
> up, or even panic. Doesn't seem to happen though.
>
> > - What was the author trying to achieve? Why a special case for RMW
> > faults?
>
> Because the 020 can't do RMW instructions,

I don't think that's correct.

> so these faults have to be expected? (My memory might be tripping me up
> there, though..)
>
> > - Should the dead code be deleted because the live algorithm cannot be
> > improved upon? (The present algorithm works fine for PDMA for example.)
>
> The end result might be the same with the current code (except for PDMA
> working): signal to user process (maybe the wrong one; weird access
> forces SEGV), or panic. The main difference is we don't fix up the
> exception from process exception tables. I believe that is what makes
> PDMA work, i.e. fixes up the PDMA bus fault?

I don't follow. Deleting dead code means no difference. Everything works
the same (and so PDMA keeps working, but that's a red herring).

> So perhaps try send_fault_signal() in the default branch, which will
> also run die_if_kernel() if need be.

I tried that (Stan tested it). It turns out that usermode instruction
faults can also traverse that branch so /sbin/init just crashed. I think I
can resolve that. But there's not much point in pursuing that until the
architecture experts agree that there's a bug to be fixed.
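
To make the cases discussed above concrete, here is a rough sketch in C.
It is not the mainline handler and not a proposed patch. The bit values
are assumed to match the kernel's 68030 definitions and should be checked
against the headers. The enum and the classify_fault() helper are
hypothetical names used only for illustration. The "not raised by the
MMU" case follows the suggestion quoted at the top and is unconfirmed.

```c
/* Assumed 68030 MMU status register bits; verify against the kernel headers. */
#define MMU_B	0x8000	/* bus error */
#define MMU_L	0x4000	/* limit violation */
#define MMU_S	0x2000	/* supervisor violation */
#define MMU_WP	0x0800	/* write-protected */
#define MMU_I	0x0400	/* invalid descriptor */

/* Assumed special status word bit for a read-modify-write cycle. */
#define RM	0x0080

/* Hypothetical labels for the cases raised in this thread. */
enum fault_class {
	FAULT_PAGE,		/* MMU_I or MMU_WP set: page or protection fault */
	FAULT_ACCESS_RMW,	/* (MMU_B|MMU_L|MMU_S) && (ssw & RM) */
	FAULT_ACCESS,		/* (MMU_B|MMU_L|MMU_S) && !(ssw & RM) */
	FAULT_NOT_MMU		/* none of the bits above set: perhaps PDMA */
};

/* Hypothetical helper: sort a bus error into one of the cases above. */
static enum fault_class classify_fault(unsigned short ssw,
				       unsigned short mmusr)
{
	if (mmusr & (MMU_I | MMU_WP))
		return FAULT_PAGE;
	if (mmusr & (MMU_B | MMU_L | MMU_S))
		return (ssw & RM) ? FAULT_ACCESS_RMW : FAULT_ACCESS;
	return FAULT_NOT_MMU;
}
```

The sketch only names the cases. What the handler ought to do in each one
is the open question.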