Thanks Stan,
On 8/02/23 08:37, Stan Johnson wrote:
Hi Michael,
On 2/5/23 3:19 PM, Michael Schmitz wrote:
...
Seeing Finn's report that Al Viro's VM_FAULT_RETRY fix may have solved
his task corruption troubles on 040, I just noticed that I probably
misunderstood how Al's patch works.
Botching up a fault retry and carrying on may well leave the page tables
in a state where some later access could go to the wrong page and
manifest as user space corruption. Could you try Al's patch 4 (m68k: fix
livelock in uaccess) to see if this helps?
...
ok, this appears to be the patch:
Signed-off-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
---
arch/m68k/mm/fault.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/m68k/mm/fault.c b/arch/m68k/mm/fault.c
index 4d2837eb3e2a..228128e45c67 100644
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -138,8 +138,11 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	fault = handle_mm_fault(vma, address, flags, regs);
 	pr_debug("handle_mm_fault returns %x\n", fault);
-	if (fault_signal_pending(fault, regs))
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
 		return 0;
+	}
 	/* The fault is fully completed (including releasing mmap lock) */
 	if (fault & VM_FAULT_COMPLETED)
That's correct.
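To spell out what the changed check does, here is a small user-space model of it. This is only a sketch and not kernel code: the enum, the function names and the two boolean parameters are made up for illustration. signal_abort stands in for fault_signal_pending(fault, regs) returning true, and user_mode stands in for user_mode(regs).

```c
#include <stdbool.h>

/* Hypothetical outcome names, used only in this sketch. */
enum fault_outcome {
	OUTCOME_RETURN_ZERO,	/* "return 0": go back to the faulting instruction */
	OUTCOME_NO_CONTEXT,	/* "goto no_context": take the exception table fixup */
	OUTCOME_CONTINUE	/* carry on with the remaining checks */
};

/* Behaviour before the patch: return 0 regardless of the faulting mode. */
enum fault_outcome old_outcome(bool signal_abort, bool user_mode)
{
	(void)user_mode;	/* the old code did not look at the mode */
	if (signal_abort)
		return OUTCOME_RETURN_ZERO;
	return OUTCOME_CONTINUE;
}

/* Behaviour with the patch: kernel-mode faults go to no_context instead. */
enum fault_outcome patched_outcome(bool signal_abort, bool user_mode)
{
	if (signal_abort) {
		if (!user_mode)
			return OUTCOME_NO_CONTEXT;
		return OUTCOME_RETURN_ZERO;
	}
	return OUTCOME_CONTINUE;
}
```

The only case that differs is signal_abort set with user_mode clear: the old code returns to the faulting kernel instruction (which, going by the patch title, is the uaccess livelock), while the patched code takes the fixup path.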
Your results show improvement but the problem does not entirely go away.
Looking at differences between 030 and 040/060 fault handling, it
appears only 030 treats faults corrected by exception tables (such as
those used in the uaccess macros) specially, i.e. by aborting bus error
processing, while 040 and 060 carry on in the fault handler.
I wonder if that's the main difference between 030 and 040 behaviour?
I'll try and log such accesses caught by exception tables on 030 to see
if they are rare enough to allow adding a kernel log message...
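For anyone following along, the general idea of an exception table lookup with a hit counter can be modelled in user space roughly like this. It is a sketch under assumptions, not the m68k implementation: struct ex_entry, find_fixup(), resume_pc(), fixup_hits and the demo addresses are all invented for illustration, and the table is assumed to be sorted by faulting instruction address.

```c
#include <stddef.h>

/* Hypothetical entry: an instruction that may fault, and where to resume. */
struct ex_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Made-up demo table, sorted by insn. */
const struct ex_entry demo_table[] = {
	{ 0x1000UL, 0x2000UL },
	{ 0x1010UL, 0x2010UL },
	{ 0x1020UL, 0x2020UL },
};
#define DEMO_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

/* Counts faults that were resolved through the table. */
unsigned long fixup_hits;

/* Binary search over a sorted table; returns NULL if pc has no entry. */
const struct ex_entry *find_fixup(const struct ex_entry *tbl, size_t n,
				  unsigned long pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == pc)
			return &tbl[mid];
		if (tbl[mid].insn < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Returns the address to resume at, or 0 if there is no fixup for pc. */
unsigned long resume_pc(const struct ex_entry *tbl, size_t n, unsigned long pc)
{
	const struct ex_entry *e = find_fixup(tbl, n, pc);

	if (!e)
		return 0;	/* no entry: nothing to fix up in this model */
	fixup_hits++;		/* this is the event I would like to log */
	return e->fixup;
}
```

In this model, fixup_hits marks the spot where a log message would go; whether the real rate is low enough for that is exactly what I want to find out.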
Cheers,
Michael