Re: core dump analysis, was Re: stack smashing detected

Finn Thain <fthain@xxxxxxxxxxxxxx> · Tue, 4 Apr 2023 14:05:06 +1000 (AEST)

On Mon, 3 Apr 2023, Michael Schmitz wrote:

On 02.04.2023 at 21:31, Finn Thain wrote:

Maybe an interaction between (multiple?) signals and syscall 
return...

When running dash from gdb in QEMU, there's only one signal (SIGCHLD) 
and it gets handled before __wait3() returns. (Of course, the "stack 
smashing detected" failure never shows up in QEMU.)

Might be a clue that we need multiple signals to force the stack 
smashing error. And we might not get that in QEMU, due to the faster 
execution in emulating on a modern processor.

Right -- being that the failure is intermittent on real hardware, it's not 
surprising that I can't make it show up in QEMU or Aranym.

But no-one has reproduced it on Atari or Amiga hardware yet so I guess it 
could be a driver issue...

I wonder whether anyone else is actually running recent Debian/SID with 
sysvinit and without a Debian initrd on a Motorola 68030 system.

Thinking a bit more about interactions between signal delivery and 
syscall return, it turns out that we don't check for pending signals 
when returning from a syscall. That's OK on non-SMP systems, because we 
don't have another process running while we execute the syscall (and we 
_do_ run signal handling when scheduling, i.e. when wait4 sleeps or is 
woken up)?

Seems we can forget about that interaction then.

Depends on how long we sleep in wait4, and whether a signal happens 
just during that time.

I agree, there seems to be a race condition there. (And dash's 
waitproc() seems to take pains to reap the child and handle the signal 
in any order.)

Yes, it makes sure the SIGCHLD is seen no matter in what order the 
signals are delivered ...

I wouldn't be surprised if this race somehow makes the failure rare.

I don't want to recompile any userland binaries at this stage, so it 
would be nice if we could modify the kernel to keep track of exactly 
how that race gets won and lost. Or perhaps there's an easy way to rig 
the outcome one way or the other.

A race between syscall return due to child exit and signal delivery 
seems unlikely, but maybe there is a race between syscall return due to 
a timer firing and signal delivery. Are there any timers set to 
periodically interrupt wait3?

I searched the source code and SIGALRM appears to be unused by dash. And 
'timeout' is not a dash builtin. But that doesn't mean we don't get 
multiple signals. One crashy script looks like this:

TMPFS_SIZE="$(tmpfs_size_vm "$TMPFS_SIZE")"
RUN_SIZE="$(tmpfs_size_vm "$RUN_SIZE")"
LOCK_SIZE="$(tmpfs_size_vm "$LOCK_SIZE")"
SHM_SIZE="$(tmpfs_size_vm "$SHM_SIZE")"
TMP_SIZE="$(tmpfs_size_vm "$TMP_SIZE")"

Is it possible that the SIGCHLD from the first sub-shell got delayed?
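
One way to exercise this pattern outside the boot scripts might be a loop of back-to-back command substitutions, sketched below. It assumes that rapid fork/wait/SIGCHLD cycles are the relevant ingredient, which is not established. The names fake_size_vm and stress_subst are hypothetical, and fake_size_vm is only a stand-in for the real tmpfs_size_vm, whose body is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical stand-in for tmpfs_size_vm: it only echoes its argument.
# The real helper does more work; this keeps the fork/wait pattern only.
fake_size_vm() {
    echo "$1"
}

# Repeat five back-to-back command substitutions N times.
# Each $(...) forks a sub-shell, so the parent shell waits for a child
# and receives a SIGCHLD once per substitution.
stress_subst() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        A="$(fake_size_vm 1)"
        B="$(fake_size_vm 2)"
        C="$(fake_size_vm 3)"
        D="$(fake_size_vm 4)"
        E="$(fake_size_vm 5)"
        # Any corrupted or lost result shows up as a mismatch here.
        [ "$A$B$C$D$E" = "12345" ] || { echo "mismatch at iteration $i"; return 1; }
        i=$((i + 1))
    done
    echo "$i iterations ok"
}
```

Calling `stress_subst 1000` from dash on the affected machine would show whether this pattern alone is enough to provoke the failure. That has not been tried here.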

Still no nearer to a solution - something smashes the stack near %sp, 
causes the %a3 register restore after __GI___wait4_time64 to return a 
wrong pointer to the stack canary, and triggers a stack smashing warning 
in this indirect way. But what??

I've no idea.

The actual corruption might offer a clue here. I believe the saved %a3 was 
clobbered with the value 0xefee1068 which seems to be a pointer into some 
stack frame that would have come into existence shortly after 
__GI___wait4_time64 was called. That stack frame is gone by the time the 
core dump was made. Was it dash's signal handler, onsig(), or some libc 
subroutine called by __GI___wait4_time64(), or was it something that the 
kernel put there?
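
To check a suspect value like this against a core dump, a small helper can say whether an address lies below or above the stack pointer. The sketch below assumes a downward-growing stack (as on m68k) and assumes that a frame which is "gone" sits below the stack pointer recorded in the dump. classify_addr is a hypothetical name, and the stack pointer in the usage note is an invented placeholder, not a value from the actual core dump.

```shell
#!/bin/sh
# Report where ADDR lies relative to the stack pointer SP.
# Both arguments may be given in hex (0x...) or decimal.
# With a downward-growing stack, addresses below SP belong to frames
# that have already been popped.
classify_addr() {
    addr=$(($1))
    sp=$(($2))
    if [ "$addr" -lt "$sp" ]; then
        echo "dead: $((sp - addr)) bytes below sp"
    else
        echo "live: $((addr - sp)) bytes above sp"
    fi
}
```

For example, `classify_addr 0xefee1068 0xefee1100` uses a made-up stack pointer of 0xefee1100 and would report the address as dead. The real %sp has to be read from the core dump.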

Dash's SIGCHLD handler looks safe enough -- I don't see how it could 
corrupt the saved registers in the __GI___wait4_time64 stack frame (it's 
not like 1 was stored in the wrong place). 
https://sources.debian.org/src/dash/0.5.12-2/src/trap.c/?hl=285#L285