On Sun, Aug 22, 2021 at 12:34 PM Michael Schmitz <schmitzmic@xxxxxxxxx> wrote:
> Got this overnight:
>
> [536154.200000] *** FORMAT ERROR *** FORMAT=0
> [536154.210000] Current process id is 4656
> [536154.230000] BAD KERNEL TRAP: 00000000
> [536154.240000] Modules linked in: atari_scsi ne 8390p [last unloaded: atari_scsi]
> [536154.260000] PC: [<00002a8c>] resume_userspace+0x14/0x16
> [536154.270000] SR: 2208 SP: 977bd1be a2: 8009b5e8
> [536154.290000] d0: 8009b5e8 d1: cfcfcfcf d2: 00000000 d3: ffffffff
> [536154.300000] d4: 00000000 d5: 00000000 a0: 8008a108 a1: 8009b7df
> [536154.320000] Process savelog (pid: 4656, task=e49aa246)
> [536154.330000] Frame format=0
> [536154.340000] Stack from 00cc5fa4:
> [536154.340000] 02088004 3666b008 1c0eb209 007eb5e8 8006a2d0 efaec378 8004366c 61ff61ff
> [536154.340000] 8006a2d4 8006a2d2 00000000 030dfffb 0044fffa 0e000000 fffa1a00 fffa1c00
> [536154.340000] fffa1e00 fffb0e40 fffb0e80 00049b66 00000040 005f5800 00000001
Strange. If I read that stack frame correctly, that seems to be an
exception frame of type 0xb ("Long Bus Cycle").
Plus the frame content is then apparently corrupted enough that the
rte causes an exception on trying to restore it.
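For reference, the frame-type reading above can be reproduced with a small decoder. This is only a sketch: it assumes the dump starts exactly at the hardware exception frame (16-bit SR, 32-bit PC, then a 16-bit format/vector word with the format in the top 4 bits), and the struct and function names are made up for illustration, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the first four 16-bit words of an m68k
 * exception frame; not a kernel structure. */
struct exc_frame_head {
    uint16_t sr;            /* saved status register */
    uint32_t pc;            /* saved program counter */
    uint8_t  format;        /* top 4 bits of the format/vector word */
    uint16_t vector_offset; /* low 12 bits of the format/vector word */
};

/*
 * Decode the frame head from the first two 32-bit words of a stack dump.
 * Assumption: words[0] is the very start of the exception frame.
 */
static struct exc_frame_head decode_frame_head(const uint32_t *words)
{
    struct exc_frame_head h;
    uint16_t fv;

    assert(words != NULL);

    h.sr = (uint16_t)(words[0] >> 16);
    /* The PC straddles the two dump words. */
    h.pc = ((words[0] & 0xffffu) << 16) | (words[1] >> 16);
    fv = (uint16_t)(words[1] & 0xffffu);
    h.format = (uint8_t)(fv >> 12);
    h.vector_offset = (uint16_t)(fv & 0x0fffu);
    return h;
}
```

Fed the first two dump words (02088004 3666b008), this yields SR 0208, PC 80043666 and a format/vector word of b008: format 0xb with vector offset 0x008, which is the bus error vector.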
None of which makes sense or seems to have much at all to do with any
of these patches. Yes, we mess with the exception frame, but only for
fork(), and while "copy_process()" doesn't set any frame type, I see
only two cases:
 - the kernel thread one does a "memset()" to clear it, so you should
   end up with frame type 0
 - the user thread case copies the original frame format (which I
   think is just the system call frame from the TRAP instruction).
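The two cases can be modelled in a few lines. This is a toy sketch, not the real arch/m68k code: "struct model_frame" and both helper functions are hypothetical stand-ins, and the layout is simplified to the fields that matter here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a saved exception frame. */
struct model_frame {
    uint16_t sr;
    uint32_t pc;
    uint16_t format_vector; /* frame format lives in bits 15..12 */
};

/* Extract the 4-bit frame format. */
static unsigned frame_format(const struct model_frame *f)
{
    assert(f != NULL);
    return (unsigned)(f->format_vector >> 12);
}

/* Kernel thread case: the frame is zeroed, so the format ends up 0. */
static void model_copy_kernel_thread(struct model_frame *child)
{
    assert(child != NULL);
    memset(child, 0, sizeof(*child));
}

/* User thread case: the child inherits a byte-for-byte copy of the
 * parent's frame, including whatever format the parent had. */
static void model_copy_user_thread(struct model_frame *child,
                                   const struct model_frame *parent)
{
    assert(child != NULL && parent != NULL);
    memcpy(child, parent, sizeof(*child));
}
```

In this model a child only gets a nonzero format if the parent's frame already had one. With a parent frame from a TRAP system call (format 0, taken here as an assumption), neither path produces a format 0xb frame.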
Are you 100% sure your hardware is stable?
Linus