Re: [RFC] m68k: Fix dead code in bus_error030()

Finn Thain <fthain@xxxxxxxxxxxxxxxxxxx> · Sat, 10 Mar 2018 09:51:04 +1100 (AEDT)

On Fri, 9 Mar 2018, Michael Schmitz wrote:

> How does the PDMA logic raise the exception? If we find none of the
> usual MMU status register bits are set, we could take that as an
> indication that the exception wasn't raised by the MMU, so no page or
> protection fault. Pretty much leaves only the PDMA logic (if present).

It's a hardware exception, not a software exception. The bus error is 
generated by signals from one of Apple's ASICs. This logic circuit 
effectively interfaces the SCSI bus with the system bus, via the SCSI 
controller, for performance. But that's hardly relevant. I'm more 
interested in the bug in mainline, not the bug in my RFC patch.

> > - What are the implications of the existing logic error?
>
> We might miss handling (MMU_B|MMU_L|MMU_S) && (ssw & RM) (should be
> harmless),

I think that leads to a recursive fault which would kill the machine 
instead of just the user process that caused the fault, but I don't have 
code to confirm this.

> and might log (MMU_B|MMU_L|MMU_S) && !(ssw & RM) as unexpected bus error
> if there wasn't a user process to signal or an exception vector to fix
> up, or even panic. Doesn't seem to happen though.
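
For reference, the conditions discussed so far can be written down as a 
small standalone classifier. This is only a sketch, not the code in 
bus_error030(): classify_fault030() and the enum are hypothetical names, 
the bit values are assumed to match the 68030 definitions in the m68k 
asm/traps.h, and which bits count as the "usual" MMU status bits is also 
an assumption.

```c
#include <stdint.h>

/* Assumed bit values (68030 SSW and MMU status register). */
#define RM     0x0080u  /* SSW: read-modify-write cycle */
#define MMU_B  0x8000u  /* MMUSR: bus error */
#define MMU_L  0x4000u  /* MMUSR: limit violation */
#define MMU_S  0x2000u  /* MMUSR: supervisor violation */
#define MMU_WP 0x0800u  /* MMUSR: write-protected */
#define MMU_I  0x0400u  /* MMUSR: invalid descriptor */

/* Hypothetical labels for the cases named in this thread. */
enum fault_class {
	FAULT_NOT_MMU,    /* no fault bits set: not raised by the MMU */
	FAULT_BLS_RMW,    /* (MMU_B|MMU_L|MMU_S) && (ssw & RM) */
	FAULT_BLS_PLAIN,  /* (MMU_B|MMU_L|MMU_S) && !(ssw & RM) */
	FAULT_MMU_OTHER   /* only MMU_WP and/or MMU_I set */
};

static enum fault_class classify_fault030(uint16_t ssw, uint16_t mmusr)
{
	/* None of the fault bits set: the MMU did not raise this. */
	if (!(mmusr & (MMU_B | MMU_L | MMU_S | MMU_WP | MMU_I)))
		return FAULT_NOT_MMU;

	/* Bus error, limit or supervisor violation: split on RMW. */
	if (mmusr & (MMU_B | MMU_L | MMU_S))
		return (ssw & RM) ? FAULT_BLS_RMW : FAULT_BLS_PLAIN;

	return FAULT_MMU_OTHER;
}
```

With these assumptions, classify_fault030(RM, MMU_B) lands in the first 
case above and classify_fault030(0, MMU_S) in the second. A status word 
with no fault bits set is reported as not coming from the MMU, whatever 
the SSW says.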

> > - What was the author trying to achieve? Why a special case for RMW
> >   faults?
>
> Because the 020 can't do RMW instructions,

I don't think that's correct.

> so these faults have to be expected? (My memory might be tripping me up
> there, though..)
>
> > - Should the dead code be deleted because the live algorithm cannot be
> >   improved upon? (The present algorithm works fine for PDMA for example.)
>
> The end result might be the same with the current code (except for PDMA
> working): signal to user process (maybe the wrong one; weird access
> forces SEGV), or panic. The main difference is we don't fix up the
> exception from process exception tables. I believe that is what makes
> PDMA work, i.e. fixes up the PDMA bus fault?

I don't follow. Deleting dead code means no difference. Everything works 
the same (and so PDMA keeps working, but that's a red herring).

> So perhaps try send_fault_signal() in the default branch, which will
> also run die_if_kernel() if need be.

I tried that (Stan tested it). It turns out that usermode instruction 
faults can also traverse that branch so /sbin/init just crashed. I think I 
can resolve that. But there's not much point in pursuing that until the 
architecture experts agree that there's a bug to be fixed.
