Re: signal delivery, was Re: reliable reproducer


 



Hi Finn,

On 29.04.2023 at 17:03, Finn Thain wrote:
On Sat, 29 Apr 2023, Michael Schmitz wrote:

On 29.04.2023 at 12:28, Finn Thain wrote:

Right. If we fix this in the signal handling code, we take care of
address errors as well, which was my concern with Andreas' patch. We
can do what do_page_fault() does and assume the worst (256 bytes?).

Well, we could do that if we could be certain this does not cause a
memory leak in some way. The reason I bring this up is that I've just
seen the kernel that I'd used to run the latest test cases (which
inserts a 20 byte gap only!) run amok terminating pretty much my entire
user space because it ran out of memory. Never seen the like of that.


If the test program ran out of stack space it would not trigger the OOM
killer. So that incident probably has something to do with upgrading your
kernel (?)

Might be, but this has all been on m68k v6.3-rc7, and I hadn't seen the memory squeeze there before. I'll have to run the test case a few hundred times on an unpatched v6.3-rc7 and on the one with the minimal frame gap to be sure, though.

Anyway, I agree that stkadj would need to account for the gap, as you
pointed out earlier.

Not sure about that anymore - mangle_kernel_stack() does not even use stkadj to shift contents on the kernel stack after restoring the exception frame from the signal stack. It uses the start address of the frame for that copy operation, and a local buffer to move the frame from user space to kernel space. It takes the extra frame size directly from the exception frame.

stkadj is the offset of the replacement exception frame on the kernel stack. The replacement frame gets us into the user space signal handler instead of completing the exception right away. stkadj is used to skip that replacement exception frame used for the signal handler on the final rte (after a trip through sys_sigreturn to copy the original exception frame back on the kernel stack).

The offset we use for the signal stack on the user stack does not matter here at all.

Or so my limited understanding goes...


I believe we can use USP to get a worst case estimate for the future
extent of the user stack. ...

What is the most data a moveml <...>,sp@- can take? If that's not too
much, a constant offset for the signal stack in case of format b frames
on 020/030 might be easiest.


I think it's 64 bytes (16 registers). But we also have to consider all of
the other instructions that may write to the stack. There's probably a
reason why do_page_fault() picked a 256 byte gap (?)
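
For reference, the 64-byte figure is just 16 registers times 4 bytes each. A minimal sketch of that arithmetic, assuming a long-word movem whose 16-bit register mask selects d0-d7/a0-a7 (the function name moveml_push_bytes is hypothetical, not kernel code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes written to the stack by "moveml <list>,sp@-".
 * Each set bit in the 16-bit register mask selects one register,
 * and each selected register is stored as a 4-byte long word.
 * Hypothetical helper for illustration only.
 */
static unsigned int moveml_push_bytes(unsigned int mask)
{
    unsigned int count = 0;

    assert(mask <= 0xFFFFu);    /* the mask is one 16-bit extension word */

    while (mask) {
        mask &= mask - 1;       /* clear the lowest set bit */
        count++;
    }
    return count * 4;
}
```

With all 16 bits set this gives the 64-byte worst case for a single moveml; other instructions that write to the stack are not covered by it.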

That's not used as a gap, just to catch any user access below the user stack pointer.
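
A rough user-space model of that kind of check, as a sketch only: the names STACK_SLACK and access_too_far_below_usp are hypothetical, and the comparison is widened to 64 bits here to sidestep wrap-around, which is an assumption of this sketch rather than a statement about the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define STACK_SLACK 256u   /* hypothetical name for the 256-byte allowance */

/*
 * Returns true if a user access at 'address' lies more than
 * STACK_SLACK bytes below the user stack pointer 'usp', i.e. the
 * access would be treated as a bad access rather than as stack growth.
 * The slack tolerates predecrement writes that land below USP
 * before USP itself has been updated.
 */
static bool access_too_far_below_usp(uint32_t address, uint32_t usp)
{
    return (uint64_t)address + STACK_SLACK < (uint64_t)usp;
}
```

In this model the 256 bytes only widen the window of accesses accepted below USP; nothing is reserved or skipped on the stack.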


But we need to find something that works in the general case (and then
analyze the performance impact it might have in stack and signal heavy
applications - I might have mentioned that before, but your equivalent
to Andreas' patch seemed quite a bit slower in the test case than when
signals were allowed after format b bus faults. Interrupt latency, most
likely).


The alternative is to use more stack memory, which means marginally more
paging. Choose your poison...

Yes - I'll have to run a few benchmarks to see which I'd prefer.

In the meantime, I'll send what I have at present as RFC.

Cheers,

	Michael


