On Sun, 23 Apr 2023, Michael Schmitz wrote:
On 23.04.2023 at 13:41, Michael Schmitz wrote:
Though the question remains - is this expected behaviour for programs that do deep recursion on the stack while taking signals (and the reason for the option to run signal handlers on an alternate stack)?
I don't understand how "deep recursion" can be used to explain this. We've seen crashes with only 1.8 MB of stack usage. The best reason I can think of for having a signal stack would be that it may be better for signal delivery to fail than for the target process to fail. But I've no idea whether the kernel makes that kind of defensive programming possible.
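For reference, the alternate-stack option mentioned above is requested with sigaltstack() plus SA_ONSTACK. What follows is only a minimal sketch, not the test program from this thread: the names run_handler_on_alt_stack and on_signal, the choice of SIGUSR1 and the 64 KiB stack size are all assumptions made for illustration. The handler just records whether one of its locals lies inside the alternate stack.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: arbitrary size, chosen to be comfortably above MINSIGSTKSZ. */
#define ALT_STACK_SIZE (64 * 1024)

static void *alt_base;                               /* base of the alternate stack */
static volatile sig_atomic_t handler_on_alt_stack = -1;

/* Hypothetical handler: checks whether its own local variable lives
 * inside the alternate stack region. */
static void on_signal(int sig, siginfo_t *info, void *ucontext)
{
    char marker;
    uintptr_t p = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_base;

    (void)sig;
    (void)info;
    (void)ucontext;
    handler_on_alt_stack = (p >= lo && p < lo + ALT_STACK_SIZE);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it ran on
 * the normal stack, -1 if setup failed. */
int run_handler_on_alt_stack(void)
{
    stack_t ss;
    struct sigaction sa;

    alt_base = malloc(ALT_STACK_SIZE);
    if (alt_base == NULL)
        return -1;

    ss.ss_sp = alt_base;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;   /* SA_ONSTACK selects the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* In a single-threaded program the signal is delivered before raise() returns. */
    raise(SIGUSR1);
    return handler_on_alt_stack;
}
```

With SA_ONSTACK set, the signal frame is built on the alternate stack, so it cannot collide with whatever the interrupted code has on the normal stack. Without the flag the same handler would report 0.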
And why does this almost always appear to happen after bus error exceptions (frame format b)? The extra exception stack information isn't even accounted for in the above frame end address!

Result with sa_sigaction handler:

parent usp  : 0xef969e28
handler tos : 0xef969e6c
handler stack overwrote usp!
frame end   : 0xef969e7c
frame start : 0xef969b58
handler usp : 0xef969b40
signal usp  : 0xef969e04
signal pc   : 0x80000696
signal fmtv : 0x114

parent usp  : 0xef955008
handler tos : 0xef955064
handler stack overwrote usp!
frame end   : 0xef955074
frame start : 0xef954d50
handler usp : 0xef954d38
signal usp  : 0xef954ffc
signal pc   : 0x80000680
signal fmtv : 0xb008

parent usp  : 0xef945eb8
handler tos : 0xef945f0c
handler stack overwrote usp!
frame end   : 0xef945f1c
frame start : 0xef945bf8
handler usp : 0xef945be0
signal usp  : 0xef945ea8
signal pc   : 0xc009f37a
signal fmtv : 0x80

parent usp  : 0xef933eb8
handler tos : 0xef933f0c
handler stack overwrote usp!
frame end   : 0xef933f1c
frame start : 0xef933bf8
handler usp : 0xef933be0
signal usp  : 0xef933ea8
signal pc   : 0xc009f37a
signal fmtv : 0x80

parent usp  : 0xef921edc
handler tos : 0xef9aaca4
handler stack overwrote usp!
frame end   : 0xef9aacb4
frame start : 0xef9aa990
handler usp : 0xef9aa978
signal usp  : 0xef9aac40
signal pc   : 0x80000782
signal fmtv : 0x114

Illegal instruction (core dumped)
I don't understand these results. If usp was really overwritten, the program would have crashed early, no?
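To make the numbers easier to compare, here is a small sketch that does the address arithmetic on the first record of the quoted results. The struct and function names are hypothetical, and I am only guessing that the "overwrote usp" message comes from comparing handler tos against parent usp; the test program's actual check is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one record of the quoted output. */
struct sig_record {
    uint32_t parent_usp;
    uint32_t handler_tos;
    uint32_t frame_end;
    uint32_t frame_start;
    uint32_t handler_usp;
    uint32_t signal_usp;
};

/* First record, copied from the results above. */
const struct sig_record first_record = {
    .parent_usp  = 0xef969e28,
    .handler_tos = 0xef969e6c,
    .frame_end   = 0xef969e7c,
    .frame_start = 0xef969b58,
    .handler_usp = 0xef969b40,
    .signal_usp  = 0xef969e04,
};

/* Size in bytes of the region reported as the signal frame. */
uint32_t frame_size(const struct sig_record *r)
{
    return r->frame_end - r->frame_start;
}

/* How far handler tos lies above parent usp. The stack grows downward,
 * so a positive value means the reported top of the handler's stack is
 * at a higher address than the parent's stack pointer. */
int32_t tos_above_parent_usp(const struct sig_record *r)
{
    return (int32_t)(r->handler_tos - r->parent_usp);
}
```

For the first record this gives a frame size of 0x324 (804) bytes and a handler tos that is 0x44 (68) bytes above parent usp. The frame size works out to 0x324 for every record quoted above.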
Exception right before crash was an interrupt in this case (only seen that once in this context, though I've seen lots of those in the course of the test runs). Frame start calculated from siginfo pointer value in this case.
I didn't realize that you could get a crash from a signal delivered following an interrupt. I'll try to modify the kernel such that signals are not delivered after page faults.
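The "signal fmtv" values in the results can be decoded with the standard 68k format/vector word layout: the top four bits are the frame format and the low twelve bits are the vector offset, i.e. the vector number times four. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Frame format: top 4 bits of the format/vector word. */
unsigned fmtv_format(uint16_t fmtv)
{
    return fmtv >> 12;
}

/* Vector number: the low 12 bits hold the vector offset (vector * 4). */
unsigned fmtv_vector(uint16_t fmtv)
{
    return (fmtv & 0x0fff) >> 2;
}
```

So 0xb008 is format 0xb with vector 2 (bus error), 0x80 is format 0 with vector 32 (TRAP #0, which Linux/m68k uses for system calls), and 0x114 is format 0 with vector 69, which falls in the user interrupt vector range (64 and up).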