Re: signal delivery, was Re: reliable reproducer

Finn Thain <fthain@xxxxxxxxxxxxxx> · Sat, 29 Apr 2023 15:03:19 +1000 (AEST)

On Sat, 29 Apr 2023, Michael Schmitz wrote:

> On 29.04.2023 at 12:28, Finn Thain wrote:

> > Right. If we fix this in the signal handling code, we take care of 
> > address errors as well, which was my concern with Andreas' patch. We 
> > can do what do_page_fault() does and assume the worst (256 bytes?).

> Well, we could do that if we could be certain this does not cause a 
> memory leak in some way. The reason I bring this up is that I've just 
> seen the kernel that I'd used to run the latest test cases (which 
> inserts only a 20-byte gap!) run amok, terminating pretty much my 
> entire user space because it ran out of memory. Never seen the like of 
> that.

If the test program ran out of stack space it would not trigger the OOM 
killer. So that incident probably has something to do with upgrading your 
kernel (?)

> Anyway, I agree that stkadj would need to account for the gap, as you 
> pointed out earlier.

> > I believe we can use USP to get a worst case estimate for the future 
> > extent of the user stack. ...

> What is the most data a moveml <...>,sp@- can take? If that's not too 
> much, a constant offset for the signal stack in case of format b frames 
> on 020/030 might be easiest.

I think it's 64 bytes (16 registers, 4 bytes each). But we also have to 
consider all of the other instructions that may write to the stack. 
There's probably a reason why do_page_fault() picked a 256 byte gap (?)
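
To put a number on the moveml case: a long-sized transfer costs 4 bytes 
per register named in the 16-bit register mask, so the worst case is just 
a population count of that mask. A minimal sketch in plain C follows; 
movem_long_bytes is a made-up helper name for illustration, not anything 
in the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: bytes written to the stack by one long-sized
 * moveml, given its 16-bit register mask (one bit per register).
 */
unsigned movem_long_bytes(uint16_t mask)
{
	unsigned count = 0;

	/* Clear the lowest set bit each pass; one pass per register. */
	while (mask) {
		mask &= (uint16_t)(mask - 1);
		count++;
	}
	return count * 4u;	/* 4 bytes per long-sized register */
}
```

A full mask (0xffff) gives 64 bytes. That only bounds a single moveml; it 
says nothing about the other instructions that write below the stack 
pointer.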

> But we need to find something that works in the general case, and then 
> analyze the performance impact it might have in stack- and signal-heavy 
> applications. I might have mentioned this before, but your equivalent 
> to Andreas' patch seemed quite a bit slower in the test case than when 
> signals were allowed after format b bus faults. Interrupt latency, most 
> likely.

The alternative is to use more stack memory, which means marginally more 
paging. Choose your poison...
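
As a rough illustration of what the constant-gap option costs in stack 
terms, here is a sketch that assumes the 256-byte figure floated above and 
a 4-byte alignment. SIG_STACK_GAP and sigframe_addr are hypothetical names, 
and this is not how the m68k signal code actually computes the frame 
address.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed worst-case gap, taken from the figure discussed in this thread. */
#define SIG_STACK_GAP 256u

/*
 * Hypothetical helper: address at which a signal frame of frame_size
 * bytes would start if we always skip SIG_STACK_GAP bytes below the
 * saved USP. The result is rounded down to a 4-byte boundary.
 */
uintptr_t sigframe_addr(uintptr_t usp, size_t frame_size)
{
	uintptr_t sp = usp - SIG_STACK_GAP - frame_size;

	return sp & ~(uintptr_t)3;
}
```

Every signal delivered this way touches up to 256 bytes more stack than 
it strictly needs, which is the extra paging mentioned above.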