Re: linux-4.4-rc1: TIF_MEMDIE without SIGKILL pending?

On Mon 23-11-15 20:06:02, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Sun 22-11-15 21:13:22, Tetsuo Handa wrote:
> > > I was updating kmallocwd in preparation for testing "[RFC 0/3] OOM detection
> > > rework v2" patchset. I noticed an unexpected result with linux.git as of
> > > 3ad5d7e06a96 .
> > > 
> > > The problem is that an OOM victim arrives at do_exit() with TIF_MEMDIE flag
> > > set but without pending SIGKILL. Is this correct behavior?
> > 
> > Have a look at out_of_memory where we do:
> >         /*
> >          * If current has a pending SIGKILL or is exiting, then automatically
> >          * select it.  The goal is to allow it to allocate so that it may
> >          * quickly exit and free its memory.
> >          *
> >          * But don't select if current has already released its mm and cleared
> >          * TIF_MEMDIE flag at exit_mm(), otherwise an OOM livelock may occur.
> >          */
> >         if (current->mm &&
> >             (fatal_signal_pending(current) || task_will_free_mem(current))) {
> >                 mark_oom_victim(current);
> >                 return true;
> >         }
> > 
> > So if the current was exiting already we are not killing it, we just give it
> > access to memory reserves to expedite the exit. We do the same thing for the
> > memcg case.
> 
> The result is the same even if I do
> 
> -	BUG_ON(test_thread_flag(TIF_MEMDIE) && !fatal_signal_pending(current));
> +	BUG_ON(test_thread_flag(TIF_MEMDIE) && !fatal_signal_pending(current) && !task_will_free_mem(current));
> 
> . I think that task_will_free_mem() is always false because this BUG_ON()
> is located before "exit_signals(tsk);  /* sets PF_EXITING */" line.

I haven't checked where exactly you added the BUG_ON, I was merely
commenting on the possibility that TIF_MEMDIE is set without sending
SIGKILL.
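
To make that possibility concrete, here is a rough user-space sketch of the
out_of_memory shortcut quoted above. It is a toy model, not kernel code:
struct toy_current, its fields and the toy_* helpers are made up for
illustration, and the "exiting" flag merely stands in for whatever
task_will_free_mem() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of "current" the shortcut looks at. */
struct toy_current {
	bool has_mm;          /* models current->mm != NULL */
	bool sigkill_pending; /* models fatal_signal_pending(current) */
	bool exiting;         /* assumed stand-in for task_will_free_mem(current) */
	bool memdie;          /* models TIF_MEMDIE */
};

/*
 * Models the quoted branch: if current still has its mm and is either
 * killed or already exiting, mark it as a victim and return true.
 * Note that nothing here queues a SIGKILL.
 */
static bool toy_oom_shortcut(struct toy_current *cur)
{
	assert(cur != NULL);
	if (cur->has_mm && (cur->sigkill_pending || cur->exiting)) {
		cur->memdie = true; /* models mark_oom_victim(current) */
		return true;
	}
	return false;
}
```

In this model a task that is already exiting and still has its mm comes out
with memdie set and sigkill_pending still clear, which is the state I was
describing.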

Now that I am looking at your BUG_ON more closely I am wondering whether
it makes sense at all. The fatal signal has been dequeued in get_signal
before we call into do_group_exit AFAICS.
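
A rough user-space sketch of the ordering I have in mind, under the
assumption that the victim was killed via SIGKILL. None of this is the
kernel's code: struct toy_task, the pending bitmask and all toy_* helpers are
invented for illustration and only mimic the order of events.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the task state relevant to the BUG_ON. */
struct toy_task {
	unsigned long pending; /* bitmask of pending signals, bit (sig - 1) */
	bool memdie;           /* models TIF_MEMDIE */
};

#define TOY_SIGKILL_BIT (1UL << (SIGKILL - 1))

/* Models fatal_signal_pending(): is SIGKILL still queued? */
static bool toy_fatal_signal_pending(const struct toy_task *t)
{
	return (t->pending & TOY_SIGKILL_BIT) != 0;
}

/* Models the OOM killer picking a victim: set the flag, queue SIGKILL. */
static void toy_oom_kill(struct toy_task *t)
{
	t->memdie = true;
	t->pending |= TOY_SIGKILL_BIT;
}

/*
 * Models the signal delivery step: the fatal signal is dequeued here,
 * before the exit path is entered. Returns the signal taken, or 0.
 */
static int toy_get_signal(struct toy_task *t)
{
	assert(t != NULL);
	if (!toy_fatal_signal_pending(t))
		return 0;
	t->pending &= ~TOY_SIGKILL_BIT;
	return SIGKILL;
}

/* The condition of the first BUG_ON variant, as seen on entry to do_exit. */
static bool toy_bug_on_would_fire(const struct toy_task *t)
{
	return t->memdie && !toy_fatal_signal_pending(t);
}
```

In this model the condition is false right after toy_oom_kill() and becomes
true as soon as toy_get_signal() has taken the SIGKILL off the queue, i.e.
before the exit path runs at all.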

[...]
-- 
Michal Hocko
SUSE Labs
