Hi Michael, On Tue, May 23, 2023 at 3:11 AM Michael Schmitz <schmitzmic@xxxxxxxxx> wrote: > On 22/05/23 23:41, Geert Uytterhoeven wrote: > > On Sat, May 6, 2023 at 11:36 AM Finn Thain <fthain@xxxxxxxxxxxxxx> wrote: > >> On 68030/020, an instruction such as, moveml %a2-%a3/%a5,%sp@- may cause > >> a stack page fault during instruction execution (i.e. not at an > >> instruction boundary) and produce a format 0xB exception frame. > >> > >> In this situation, the value of USP will be unreliable. If a signal is to > >> be delivered following the exception, this USP value is used to calculate > >> the location for a signal frame. This can result in a corrupted user > >> stack. > >> > >> The corruption was detected in dash (actually in glibc) where it showed > >> up as an intermittent "stack smashing detected" message and crash > >> following signal delivery for SIGCHLD. > >> > >> It was hard to reproduce that failure because delivery of the signal > >> raced with the page fault and because the kernel places an unpredictable > >> gap of up to 7 bytes between the USP and the signal frame. > >> > >> A format 0xB exception frame can be produced by a bus error or an address > >> error. The 68030 Users Manual says that address errors occur immediately > >> upon detection during instruction prefetch. The instruction pipeline > >> allows prefetch to overlap with other instructions, which means an > >> address error can arise during the execution of a different instruction. > >> So it seems likely that this patch may help in the address error case also. > >> > >> Reported-and-tested-by: Stan Johnson <userm57@xxxxxxxxx> > >> Link: https://lore.kernel.org/all/CAMuHMdW3yD22_ApemzW_6me3adq6A458u1_F0v-1EYwK_62jPA@xxxxxxxxxxxxxx/ > >> Cc: Michael Schmitz <schmitzmic@xxxxxxxxx> > >> Cc: Andreas Schwab <schwab@xxxxxxxxxxxxxx> > >> Cc: stable@xxxxxxxxxxxxxxx > >> Co-developed-by: Michael Schmitz <schmitzmic@xxxxxxxxx> > >> Signed-off-by: Michael Schmitz <schmitzmic@xxxxxxxxx> > >> Signed-off-by: Finn Thain <fthain@xxxxxxxxxxxxxx> > > Reviewed-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> > > i.e. will queue as a fix in the m68k for-v6.4 branch. > > > > I plan to send this upstream later this week, so any additional > > testing would be appreciated. > > I've given this some lengthy stress testing, and haven't seen it fail once. > > In contrast, various attempts of mine to improve on the concept (by only > moving the signal frame away from the USP in case it's likely to clash) > sometimes came up against a kernel bus error in setup_frame() when > copying the signo to the signal frame. I must be making some incorrect > assumptions still ... > > Limiting the signal frame shift to bus fault exceptions that happen > mid-instruction is not too much of an overhead even in low memory > settings, and using 256 bytes (the largest possible operand size, i.e. > the largest adjustment to USP that might occur on completion of the > interrupted instruction) did not seem to cause any issues with stack > growth either. > > I can give this some more testing in ARAnyM (extending the stack shift > to format 7 frames) but I'd say it's got as much testing on 030 hardware > as we can do. Thank you for your continued testing! And thanks a lot to anyone involved in nailing this issue! Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds