Re: core dump analysis, was Re: stack smashing detected

Hi Finn,

On 02.04.2023 at 21:31, Finn Thain wrote:
On Sun, 2 Apr 2023, Michael Schmitz wrote:

Saved registers are restored from the stack before return from
__GI___wait4_time64 but we don't know which of the two wait4 call sites
was used, do we?

What registers does __m68k_read_tp@plt clobber?


But that won't matter to the caller, __wait3, right?

Not if they are properly restored ... best not go there.

I did check that %a3 was saved on entry, before any wait4 syscall or
__m68k_read_tp call etc. I also looked at the rts and %a3 did get restored
there. Is it worth the effort to trace every branch, in case there's some
way to reach an rts without having first restored the saved registers?

No, I don't think that's possible - from inspection, I now see that __GI___wait4_time64 does not allow that, and I think the same is true for wait3 (though I haven't spent quite as long on that one).



Maybe an interaction between (multiple?) signals and syscall return...

When running dash from gdb in QEMU, there's only one signal (SIGCHLD) and
it gets handled before __wait3() returns. (Of course, the "stack smashing
detected" failure never shows up in QEMU.)

That might be a clue that we need multiple signals to force the stack smashing error. And we might not get that in QEMU, due to the faster execution when emulating on a modern processor.

Thinking a bit more about interactions between signal delivery and syscall return: it turns out that we don't check for pending signals when returning from a syscall. That should be OK on a non-SMP system like ours, shouldn't it? No other process is running while we execute the syscall, and we _do_ run signal handling when scheduling, i.e. when wait4 sleeps or is woken up.

Seems we can forget about that interaction then.


depends on how long we sleep in wait4, and whether a signal happens just
during that time.


I agree, there seems to be a race condition there. (And dash's waitproc()
seems to take pains to reap the child and handle the signal in any order.)

Yes, it makes sure the SIGCHLD is seen no matter in which order the child exit and the signal are delivered ...
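
Roughly what I mean is something like the following reaping loop - a simplified sketch only, not dash's actual waitproc() (the flag and function names are made up):

#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Simplified sketch of a shell-style reaping loop (not dash's actual
 * waitproc()): block SIGCHLD, poll with WNOHANG, and use sigsuspend()
 * so the child is noticed no matter whether the exit or the signal is
 * seen first. */

static volatile sig_atomic_t gotsigchld;

static void on_sigchld(int sig)
{
	(void)sig;
	gotsigchld = 1;		/* just note the signal; reaping happens below */
}

static pid_t reap_child(void)
{
	struct sigaction sa;
	sigset_t chld, old;
	int status;
	pid_t pid;

	sa.sa_handler = on_sigchld;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGCHLD, &sa, NULL);

	sigemptyset(&chld);
	sigaddset(&chld, SIGCHLD);
	sigprocmask(SIG_BLOCK, &chld, &old);	/* close the race window */

	/* Poll without blocking; if no child has exited yet, sigsuspend()
	 * atomically unblocks SIGCHLD and sleeps, so the exit is seen no
	 * matter whether it happens before or after the wait3 call. */
	while ((pid = wait3(&status, WNOHANG, NULL)) == 0)
		sigsuspend(&old);

	sigprocmask(SIG_SETMASK, &old, NULL);
	return pid;
}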

I wouldn't be surprised if this race somehow makes the failure rare.

I don't want to recompile any userland binaries at this stage, so it would
be nice if we could modify the kernel to keep track of exactly how that
race gets won and lost. Or perhaps there's an easy way to rig the outcome
one way or the other.

A race between syscall return due to child exit and signal delivery seems unlikely, but maybe there is a race between syscall return due to a timer firing and signal delivery. Are there any timers set to periodically interrupt wait3?
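
If we do want to watch the race directly, a small bit of instrumentation might already be enough. Untested sketch, assuming we call a helper like this from kernel_wait4() in kernel/exit.c right after do_wait() returns (the helper name and log format are made up):

/* Untested sketch: log whether a signal was already pending when wait4
 * returns, to see how the exit-vs-signal race is decided for dash.
 * Meant to be called from kernel_wait4() in kernel/exit.c right after
 * do_wait(). */
#include <linux/printk.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>
#include <linux/string.h>

static void trace_wait4_race(long ret)
{
	/* only log the process we care about, to keep the noise down */
	if (strcmp(current->comm, "dash"))
		return;

	pr_info("wait4: pid=%d ret=%ld sigpending=%d\n",
		task_pid_nr(current), ret,
		signal_pending(current) ? 1 : 0);
}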


%a3 is the first register saved to the switch stack BTW.

That kernel does contain Al Viro's patch that corrected our switch stack
handling in the signal return path? I wonder whether there's a potential
race lurking in there?


I'm not sure which patch you're referring to, but I think Al's signal
handling work appeared in v5.15-rc4. I have reproduced the "stack smashing

I have it in 5.15-rc2 in my tree but that's probably from my running tests on that patch series.

detected" failure with v5.14.0 and with recent mainline (62bad54b26db from
March 30th).

OK, so it's not related (or the patch did not fix all the problems with multiple signals, but a) that's unlikely and b) signals during wait4 should not matter, see above).

So the fact that %a3 is involved here is probably just coincidence.


And I just noticed that we had trouble with a copy_to_user in
setup_frame() earlier (the reason for my buserr handler patch). I wonder
whether something's gone wrong there. Do you get a segfault instead of
the abort signal if you drop my patch?


Are you referring to e36a82bebbf7? I doubt that it's related. I believe
that copy_to_user is not involved here for the reason already given, i.e.
wait3(status, flags, NULL) means wait4 gets a NULL pointer for the struct
rusage * parameter. Also, Stan first reported this failure in December
with v6.0.9.

Can't be related then.
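
(For reference, the wait3/wait4 relationship relied on above is roughly the following - a sketch, not glibc's actual source:)

/* Rough sketch of the wait3 -> wait4 relationship (not glibc's actual
 * implementation): wait3() just forwards its rusage pointer, so
 * wait3(&status, flags, NULL) becomes a wait4() call with a NULL
 * struct rusage *, and no rusage data needs to be copied out. */
#define _DEFAULT_SOURCE
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>

static pid_t my_wait3(int *status, int options, struct rusage *usage)
{
	return wait4(-1, status, options, usage);
}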

Still no nearer to a solution - something smashes the stack near %sp, the %a3 value restored after __GI___wait4_time64 then ends up as a wrong pointer to the stack canary, and that triggers the stack smashing warning in this indirect way. But what??
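
For what it's worth, the failure mode I have in mind looks roughly like this - a conceptual sketch of the stack-protector sequence, not actual compiler output, and the detail that the guard is reached through a pointer held in a call-saved register (%a3 here) is our working assumption:

/* Conceptual sketch of the stack-protector sequence (not actual
 * compiler output): the canary is copied into the frame on entry and
 * compared on exit. If the saved register used to reach
 * __stack_chk_guard (apparently %a3 here) is restored from a smashed
 * stack slot, the final comparison reads the reference canary through
 * a bogus pointer and __stack_chk_fail() reports "stack smashing
 * detected" even if the in-frame canary itself is intact. */
extern unsigned long __stack_chk_guard;
extern void __stack_chk_fail(void);

int protected_function(void)
{
	unsigned long canary = __stack_chk_guard;	/* prologue */
	char buf[64];

	/* ... real work that might scribble near %sp ... */
	(void)buf;

	if (canary != __stack_chk_guard)		/* epilogue check */
		__stack_chk_fail();
	return 0;
}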

Cheers,

	Michael



