Re: [REPOST] [PATCH 1/2] mm: Fix race between setting TIF_MEMDIE and __alloc_pages_high_priority().

On Sun 23-08-15 16:21:41, Tetsuo Handa wrote:
> From 4a3cf5be07a66cf3906a380e77ba5e2ac1b2b3d5 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Sat, 1 Aug 2015 22:39:30 +0900
> Subject: [PATCH 1/2] mm: Fix race between setting TIF_MEMDIE and
>  __alloc_pages_high_priority().
> 
> Currently, TIF_MEMDIE is checked at gfp_to_alloc_flags() which is before
> calling __alloc_pages_high_priority() and at
> 
>   /* Avoid allocations with no watermarks from looping endlessly */
>   if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
> 
> which is after returning from __alloc_pages_high_priority(). This means
> that if TIF_MEMDIE is set between returning from gfp_to_alloc_flags() and
> checking test_thread_flag(TIF_MEMDIE), the allocation can fail without
> calling __alloc_pages_high_priority(). We need to replace
> "test_thread_flag(TIF_MEMDIE)" with "whether TIF_MEMDIE was already set
> as of calling gfp_to_alloc_flags()" in order to close this race window.
> 
> Since gfp_to_alloc_flags() includes ALLOC_NO_WATERMARKS for several cases,
> it will be more correct to replace "test_thread_flag(TIF_MEMDIE)" with
> "whether gfp_to_alloc_flags() included ALLOC_NO_WATERMARKS" because the
> purpose of test_thread_flag(TIF_MEMDIE) is to give up immediately if
> __alloc_pages_high_priority() failed.

Yes, setting TIF_MEMDIE is inherently racy. We will fail the allocation
without diving into reserves. Why is that a problem?
The comment above the check is misleading, but with this change you allow
all ALLOC_NO_WATERMARKS allocations (without __GFP_NOFAIL) to fail before
they enter direct reclaim and compaction. This seems incorrect. What
about __GFP_MEMALLOC requests?

I think the check for TIF_MEMDIE makes more sense here.
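
To make the concern concrete, here is a minimal userspace sketch of the two
checks. It is not kernel code: the MODEL_* bit values and the model_*
functions are hypothetical stand-ins, and model_gfp_to_alloc_flags() only
covers the __GFP_MEMALLOC and TIF_MEMDIE cases discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this userspace model; not the kernel's. */
#define MODEL_GFP_NOFAIL          0x1u
#define MODEL_GFP_MEMALLOC        0x2u
#define MODEL_GFP_NOMEMALLOC      0x4u
#define MODEL_ALLOC_NO_WATERMARKS 0x1u

/*
 * Simplified stand-in for gfp_to_alloc_flags(): only the no-watermark
 * cases relevant to this thread (__GFP_MEMALLOC, TIF_MEMDIE) are modelled.
 */
static unsigned int model_gfp_to_alloc_flags(unsigned int gfp_mask,
					     bool tif_memdie)
{
	unsigned int alloc_flags = 0;

	if (!(gfp_mask & MODEL_GFP_NOMEMALLOC)) {
		if ((gfp_mask & MODEL_GFP_MEMALLOC) || tif_memdie)
			alloc_flags |= MODEL_ALLOC_NO_WATERMARKS;
	}
	return alloc_flags;
}

/* Current check: give up if the task is an OOM victim right now. */
static bool model_old_check_bails(unsigned int gfp_mask, bool tif_memdie)
{
	return tif_memdie && !(gfp_mask & MODEL_GFP_NOFAIL);
}

/* Proposed check: give up whenever watermarks were being ignored. */
static bool model_new_check_bails(unsigned int gfp_mask,
				  unsigned int alloc_flags)
{
	return (alloc_flags & MODEL_ALLOC_NO_WATERMARKS) &&
	       !(gfp_mask & MODEL_GFP_NOFAIL);
}

/* Walks through the two scenarios from this thread. */
static void model_self_check(void)
{
	/* __GFP_MEMALLOC request from a task that is not an OOM victim. */
	unsigned int flags = model_gfp_to_alloc_flags(MODEL_GFP_MEMALLOC, false);

	assert(!model_old_check_bails(MODEL_GFP_MEMALLOC, false));
	assert(model_new_check_bails(MODEL_GFP_MEMALLOC, flags));

	/* The race: flags computed first, TIF_MEMDIE set afterwards. */
	flags = model_gfp_to_alloc_flags(0, false);
	assert(model_old_check_bails(0, true));
	assert(!model_new_check_bails(0, flags));
}
```

In this model the proposed check closes the race from the changelog, but it
also bails out for a plain __GFP_MEMALLOC request, which the current check
lets through to reclaim and compaction.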

> 
> Note that we could simply do
> 
>   if (alloc_flags & ALLOC_NO_WATERMARKS) {
>     ac->zonelist = node_zonelist(numa_node_id(), gfp_mask);
>     page = __alloc_pages_high_priority(gfp_mask, order, ac);
>     if (page)
>       goto got_pg;
>     WARN_ON_ONCE(!wait && (gfp_mask & __GFP_NOFAIL));
>     goto nopage;
>   }
> 
> instead of changing to
> 
>   if ((alloc_flags & ALLOC_NO_WATERMARKS) && !(gfp_mask & __GFP_NOFAIL))
>     goto nopage;
> 
> if we can duplicate
> 
>   if (!wait) {
>     WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL);
>     goto nopage;
>   }
> 
> .
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> ---
>  mm/page_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 4b220cb..37a0390 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3085,7 +3085,7 @@ retry:
>  		goto nopage;
>  
>  	/* Avoid allocations with no watermarks from looping endlessly */
> -	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
> +	if ((alloc_flags & ALLOC_NO_WATERMARKS) && !(gfp_mask & __GFP_NOFAIL))
>  		goto nopage;
>  
>  	/*
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs
