Re: segfaults of processes while being killed after commit "mm: make the page fault mmap locking killable"

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Tue, 25 Jul 2023 09:38:32 -0700

On Tue, 25 Jul 2023 at 04:16, Fiona Ebner <f.ebner@xxxxxxxxxxx> wrote:
>
> will end up without a vma and cause/log the segfault. Of course the
> process is already being killed, but I'd argue it is very confusing to
> users when apparent segfaults from such processes are being logged by
> the kernel.

Ahh. Yes, that wasn't the intent. A process that is being killed
should exit with the lethal signal, not SIGSEGV.

This is not new, btw - this situation is exactly the same as if you
use something like NFS where the process is killed and the IO is
interrupted and the page fault faults for that reason.

But I suspect *very* few people actually encounter that NFS situation
(you can get it on local filesystems too, but the IO race window is
then so small as to probably not be triggerable at all).

So the new killable() check is probably much easier to actually
trigger in practice, even though it's not a new situation per se.

What exactly made you notice? Is it just the logging from
'show_unhandled_signals' being set?

Because the actual signal itself, from the

        force_sig_fault(SIGSEGV, si_code, (void __user *)address);

in __bad_area_nosemaphore() should be overridden by the fact that a
lethal signal was already pending.

But let's add a couple of signal people rather than the mm people to
the participants. Eric, Oleg - would not an existing fatal signal take
precedence over a new SIGSEGV? I obviously thought it did, but looking
at 'get_signal()' and the signal delivery, I don't actually see any
code to that effect.

Fiona - that patch is easily reverted, and it was done as a separate
patch exactly because I was wondering if there was some subtle reason
we didn't already do that.

But before we revert it, would you mind trying out the attached
trivial patch instead?

I'd also still be interested if the symptoms were anything else than
'show_unhandled_signals' causing the show_signal_msg() dance, and
resulting in a message something like

    a.out[1567]: segfault at xyz ip [..] likely on CPU X

in dmesg...

                 Linus
 arch/x86/mm/fault.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index e8711b2cafaf..b4a0290e963c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -831,6 +831,10 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 	 */
 	local_irq_enable();
 
+	/* If a fatal signal is pending, don't bother with anything else */
+	if (fatal_signal_pending())
+		return;
+
 	/*
 	 * Valid to do another page fault here because this one came
 	 * from user space: