Re: zram OOM behavior

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 31, 2012 at 09:48:57PM -0700, David Rientjes wrote:
> On Thu, 1 Nov 2012, Minchan Kim wrote:
> 
> > It's not true any more.
> > 3.6 includes following code in try_to_free_pages
> > 
> >         /*   
> >          * Do not enter reclaim if fatal signal is pending. 1 is returned so
> >          * that the page allocator does not consider triggering OOM
> >          */
> >         if (fatal_signal_pending(current))
> >                 return 1;
> > 
> > So the hunged task never go to the OOM path and could be looping forever.
> > 
> 
> Ah, interesting.  This is from commit 5515061d22f0 ("mm: throttle direct 
> reclaimers if PF_MEMALLOC reserves are low and swap is backed by network 
> storage").  Thanks for adding Mel to the cc.
> 
> The oom killer specifically has logic for this condition: when calling 
> out_of_memory() the first thing it does is
> 
> 	if (fatal_signal_pending(current))
> 		set_thread_flag(TIF_MEMDIE);
> 
> to allow it access to memory reserves so that it may exit if it's having 
> trouble.  But that ends up never happening because of the above code that 
> Minchan has identified.
> 
> So we either need to do set_thread_flag(TIF_MEMDIE) in try_to_free_pages() 
> as well or revert that early return entirely; there's no justification 
> given for it in the comment nor in the commit log.  I'd rather remove it 
> and allow the oom killer to trigger and grant access to memory reserves 
> itself if necessary.
> 
> Mel, how does commit 5515061d22f0 deal with threads looping forever if 
> they need memory in the exit path since the oom killer never gets called?
> 
> That aside, it doesn't seem like this is the issue that Luigi is reporting 
> since his patch that avoids deferring the oom killer presumably fixes the 
> issue for him.  So it turns out the oom killer must be getting called.

Exactly.

> 
> Luigi, can you try this instead?  It applies to the latest git but should 
> be easily modified to apply to any 3.x kernel you're running.
> ---
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -310,26 +310,13 @@ enum oom_scan_t oom_scan_process_thread(struct task_struct *task,
>  	if (!task->mm)
>  		return OOM_SCAN_CONTINUE;
>  
> -	if (task->flags & PF_EXITING) {
> +	if (task->flags & PF_EXITING && !force_kill) {
>  		/*
> -		 * If task is current and is in the process of releasing memory,
> -		 * allow the "kill" to set TIF_MEMDIE, which will allow it to
> -		 * access memory reserves.  Otherwise, it may stall forever.
> -		 *
> -		 * The iteration isn't broken here, however, in case other
> -		 * threads are found to have already been oom killed.
> +		 * If this task is not being ptraced on exit, then wait for it
> +		 * to finish before killing some other task unnecessarily.
>  		 */
> -		if (task == current)
> -			return OOM_SCAN_SELECT;
> -		else if (!force_kill) {
> -			/*
> -			 * If this task is not being ptraced on exit, then wait
> -			 * for it to finish before killing some other task
> -			 * unnecessarily.
> -			 */
> -			if (!(task->group_leader->ptrace & PT_TRACE_EXIT))
> -				return OOM_SCAN_ABORT;
> -		}
> +		if (!(task->group_leader->ptrace & PT_TRACE_EXIT))
> +			return OOM_SCAN_ABORT;
>  	}
>  	return OOM_SCAN_OK;
>  }
> @@ -706,11 +693,11 @@ void out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
>  		return;
>  
>  	/*
> -	 * If current has a pending SIGKILL, then automatically select it.  The
> -	 * goal is to allow it to allocate so that it may quickly exit and free
> -	 * its memory.
> +	 * If current has a pending SIGKILL or is exiting, then automatically
> +	 * select it.  The goal is to allow it to allocate so that it may
> +	 * quickly exit and free its memory.
>  	 */
> -	if (fatal_signal_pending(current)) {
> +	if (fatal_signal_pending(current) || current->flags & PF_EXITING) {
>  		set_thread_flag(TIF_MEMDIE);
>  		return;
>  	}

Looks good to me.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]