On Tue, 25 Jul 2023 at 04:16, Fiona Ebner <f.ebner@xxxxxxxxxxx> wrote: > > will end up without a vma and cause/log the segfault. Of course the > process is already being killed, but I'd argue it is very confusing to > users when apparent segfaults from such processes are being logged by > the kernel. Ahh. Yes, that wasn't the intent. A process that is being killed should exit with the lethal signal, not SIGSEGV. This is not new, btw - this situation is exactly the same as if you use something like NFS where the process is killed and the IO is interrupted and the page fault faults for that reason. But I suspect *very* few people actually encounter that NFS situation (you can get it on local filesystems too, but the IO race window is then so small as to probably not be triggerable at all). So the new killable() check is probably much easier to actually trigger in practice, even though it's not a new situation per se. What exactly made you notice? Is it just the logging from 'show_unhandled_signals' being set? Because the actual signal itself, from the force_sig_fault(SIGSEGV, si_code, (void __user *)address); in __bad_area_nosemaphore() should be overridden by the fact that a lethal signal was already pending. But let's add a couple of signal people rather than the mm people to the participants. Eric, Oleg - would not an existing fatal signal take precedence over a new SIGSEGV? I obviously thought it did, but looking at 'get_signal()' and the signal delivery, I don't actually see any code to that effect. Fiona - that patch is easily reverted, and it was done as a separate patch exactly because I was wondering if there was some subtle reason we didn't already do that. But before we revert it, would you mind trying out the attached trivial patch instead? I'd also still be interested if the symptoms were anything else than 'show_unhandled_signals' causing the show_signal_msg() dance, and resulting in a message something like a.out[1567]: segfault at xyz ip [..] likely on CPU X in dmesg... Linus
arch/x86/mm/fault.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c index e8711b2cafaf..b4a0290e963c 100644 --- a/arch/x86/mm/fault.c +++ b/arch/x86/mm/fault.c @@ -831,6 +831,10 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, */ local_irq_enable(); + /* If a fatal signal is pending, don't bother with anything else */ + if (fatal_signal_pending()) + return; + /* * Valid to do another page fault here because this one came * from user space: