On Thu, 29 Mar 2018 20:27:50 +0900 Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:

> Theoretically it is possible that an mm_struct with 60000+ vmas loops
> while potentially allocating memory, with mm->mmap_sem held for write
> by the current thread. Unless I overlooked that fatal_signal_pending()
> is checked somewhere in the loop, this is bad if the current thread was
> selected as an OOM victim, for the current thread will continue
> allocating from memory reserves while the OOM reaper is unable to
> reclaim memory.

All of which implies to me that this patch fixes a problem which is not
known to exist!

> But there is no point in continuing the loop from the beginning if the
> current thread has been killed. If there were a __GFP_KILLABLE flag (or
> something like memalloc_nofs_save()/memalloc_nofs_restore()), we could
> apply it to all allocations inside the loop. But since we don't have
> such a flag, this patch uses a fatal_signal_pending() check inside the
> loop.

Dumb question: if a thread has been oom-killed and then tries to
allocate memory, should the page allocator just fail the allocation
attempt?  I suppose there are all sorts of reasons why not :(

In which case, yes, setting a new
PF_MEMALLOC_MAY_FAIL_IF_I_WAS_OOMKILLED around such code might be a
tidy enough solution.  It would be a bit sad to add another test in the
hot path (should_fail_alloc_page()?), but geeze we do a lot of junk
already.

> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -440,6 +440,10 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
>  			continue;
>  		}
>  		charge = 0;
> +		if (fatal_signal_pending(current)) {
> +			retval = -EINTR;
> +			goto out;
> +		}
>  		if (mpnt->vm_flags & VM_ACCOUNT) {
>  			unsigned long len = vma_pages(mpnt);

I think a comment explaining why we're doing this would help.  Better
would be to add a new function "current_is_oom_killed()" or such, which
becomes self-documenting, because there are other reasons why a task
may have a fatal signal pending.
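
For illustration, a self-documenting helper along those lines might look
something like this (an untested sketch, not part of the posted patch;
routing it through tsk_is_oom_victim() is just one possible
implementation):

/*
 * Sketch only: say *why* we are bailing out of a long
 * mmap_sem-holding loop.  tsk_is_oom_victim() distinguishes an OOM
 * kill from the other reasons a fatal signal may be pending; a
 * simpler variant could keep using fatal_signal_pending(current)
 * and also cover a plain SIGKILL.
 */
static inline bool current_is_oom_killed(void)
{
	return tsk_is_oom_victim(current);
}

The hunk above would then read:

		charge = 0;
		/* The OOM reaper cannot reclaim from this mm while we
		 * hold mmap_sem for write, so give up early. */
		if (current_is_oom_killed()) {
			retval = -EINTR;
			goto out;
		}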
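
And to make the PF_MEMALLOC_MAY_FAIL_IF_I_WAS_OOMKILLED idea concrete,
a rough sketch (the flag value and the allocator hook point are purely
illustrative; a real flag would need a free bit audited against the
existing PF_* definitions in sched.h):

/* Hypothetical task flag -- the value is made up for illustration */
#define PF_MEMALLOC_MAY_FAIL_IF_I_WAS_OOMKILLED	0x02000000

	/* dup_mmap() would bracket the whole vma copy loop, so no
	 * per-allocation changes are needed inside it: */
	current->flags |= PF_MEMALLOC_MAY_FAIL_IF_I_WAS_OOMKILLED;
	/* ... existing vma copy loop ... */
	current->flags &= ~PF_MEMALLOC_MAY_FAIL_IF_I_WAS_OOMKILLED;

	/* and the page allocator, somewhere near the existing
	 * should_fail_alloc_page() test, would fail the allocation
	 * outright for an oom-killed task instead of letting it dip
	 * into memory reserves: */
	if (unlikely((current->flags & PF_MEMALLOC_MAY_FAIL_IF_I_WAS_OOMKILLED) &&
		     tsk_is_oom_victim(current)))
		return NULL;

That keeps the change to two flag sites plus one allocator test, at the
cost of the extra check in the hot path noted above.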