Re: [patch added to 3.12-stable] mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEED

Greg, could you take this backport to 3.14 too?

On 08/09/2016, 01:38 PM, Jiri Slaby wrote:
> From: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> 
> This patch has been added to the 3.12 stable tree. If you have any
> objections, please let us know.
> 
> ===============
> 
> commit ad33bb04b2a6cee6c1f99fabb15cddbf93ff0433 upstream.
> 
> pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were
> introduced to locklessly (but atomically) detect when a pmd is a regular
> (stable) pmd or when the pmd is unstable and can infinitely transition
> between pmd_none() and pmd_trans_huge() from under us, while only holding
> the mmap_sem for reading (not for writing).
> 
> While holding the mmap_sem only for reading, MADV_DONTNEED can run from
> under us and so before we can assume the pmd to be a regular stable pmd
> we need to compare it against pmd_none() and pmd_trans_huge() in an
> atomic way, with pmd_trans_unstable().  The old pmd_trans_huge() left a
> tiny window for a race.
> 
> Useful applications are unlikely to notice the difference as doing
> MADV_DONTNEED concurrently with a page fault would lead to undefined
> behavior.
> 
> [js] 3.12 backport: no pmd_devmap in 3.12 yet.
> 
> [akpm@xxxxxxxxxxxxxxxxxxxx: tidy up comment grammar/layout]
> Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Reported-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Signed-off-by: Jiri Slaby <jslaby@xxxxxxx>
> ---
>  mm/memory.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d0d84c36cd5c..61926356c09a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3814,8 +3814,18 @@ static int __handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	if (unlikely(pmd_none(*pmd)) &&
>  	    unlikely(__pte_alloc(mm, vma, pmd, address)))
>  		return VM_FAULT_OOM;
> -	/* if an huge pmd materialized from under us just retry later */
> -	if (unlikely(pmd_trans_huge(*pmd)))
> +	/*
> +	 * If a huge pmd materialized under us just retry later.  Use
> +	 * pmd_trans_unstable() instead of pmd_trans_huge() to ensure the pmd
> +	 * didn't become pmd_trans_huge under us and then back to pmd_none, as
> +	 * a result of MADV_DONTNEED running immediately after a huge pmd fault
> +	 * in a different thread of this mm, in turn leading to a misleading
> +	 * pmd_trans_huge() retval.  All we have to ensure is that it is a
> +	 * regular pmd that we can walk with pte_offset_map() and we can do that
> +	 * through an atomic read in C, which is what pmd_trans_unstable()
> +	 * provides.
> +	 */
> +	if (unlikely(pmd_trans_unstable(pmd)))
>  		return 0;
>  	/*
>  	 * A regular pmd is established and it can't morph into a huge pmd
> 
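
For anyone reviewing the backport, here is a minimal user-space sketch of
the difference between the old and the new check. It is not kernel code:
the sketch_* names, the bit encoding and the use of C11 atomics are all
assumptions made for illustration, and the real pmd_trans_unstable() also
handles bad pmds, which is left out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding, for illustration only (not the kernel's pmd layout). */
#define SKETCH_PMD_NONE ((uint64_t)0)
#define SKETCH_PMD_HUGE ((uint64_t)1 << 0)

typedef _Atomic uint64_t sketch_pmd_t;

static bool sketch_pmd_none(uint64_t v) { return v == SKETCH_PMD_NONE; }
static bool sketch_pmd_huge(uint64_t v) { return (v & SKETCH_PMD_HUGE) != 0; }

/*
 * Old-style check: only asks "is it huge right now?".  A pmd that went
 * huge -> none (MADV_DONTNEED in another thread) just before this read
 * is not caught, and the caller would go on to walk it as a regular pmd.
 */
static bool sketch_old_check(sketch_pmd_t *pmd)
{
	return sketch_pmd_huge(atomic_load(pmd));
}

/*
 * New-style check: take one atomic snapshot and test it against both
 * unstable states.  Returns true when the caller should bail out and retry.
 */
static bool sketch_trans_unstable(sketch_pmd_t *pmd)
{
	uint64_t v = atomic_load(pmd);

	return sketch_pmd_none(v) || sketch_pmd_huge(v);
}

/* Walks the three states of interest; returns 0 if all checks hold. */
static int sketch_self_check(void)
{
	sketch_pmd_t pmd = SKETCH_PMD_NONE;

	/* pmd zapped back to none: old check misses it, new check catches it. */
	assert(!sketch_old_check(&pmd));
	assert(sketch_trans_unstable(&pmd));

	/* huge pmd: both checks ask for a retry. */
	atomic_store(&pmd, SKETCH_PMD_HUGE);
	assert(sketch_old_check(&pmd));
	assert(sketch_trans_unstable(&pmd));

	/* regular pmd (hypothetical value with the huge bit clear): safe to walk. */
	atomic_store(&pmd, (uint64_t)0x1000);
	assert(!sketch_trans_unstable(&pmd));
	return 0;
}
```

The first case in sketch_self_check() is the one the patch is about: the
value is pmd_none again by the time it is read, so a check for "huge" alone
lets it through, while the single-snapshot check treats it as unstable.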

thanks,
-- 
js
suse labs