Re: [PATCHv3 03/12] mm: fix handling PTE-mapped THPs in page_referenced()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun 29-01-17 20:38:49, Kirill A. Shutemov wrote:
> For PTE-mapped THP page_check_address_transhuge() is not adequate: it
> cannot find all relevant PTEs, only the first one. It means we can miss
> some references of the page and it can result in suboptimal decisions by
> vmscan.
> 
> Let's switch it to page_vma_mapped_walk().
> 
> I don't think it's subject for stable@: it's not fatal. The only side
> effect is that THP can be swapped out when it shouldn't.

Please be more specific about the situation when this happens and how a
user can recognize this is going on. In other words when should I
consider backporting this series.

Also the interface is quite awkward imho. Why cannot we provide a
callback into page_vma_mapped_walk and call it for each pte/pmd that
matters to the given page? Wouldn't that be much easier than the loop
around page_vma_mapped_walk iterator?

> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> ---
>  mm/rmap.c | 66 ++++++++++++++++++++++++++++++++-------------------------------
>  1 file changed, 34 insertions(+), 32 deletions(-)
> 
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 91619fd70939..0dff8accd629 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -886,45 +886,48 @@ struct page_referenced_arg {
>  static int page_referenced_one(struct page *page, struct vm_area_struct *vma,
>  			unsigned long address, void *arg)
>  {
> -	struct mm_struct *mm = vma->vm_mm;
>  	struct page_referenced_arg *pra = arg;
> -	pmd_t *pmd;
> -	pte_t *pte;
> -	spinlock_t *ptl;
> +	struct page_vma_mapped_walk pvmw = {
> +		.page = page,
> +		.vma = vma,
> +		.address = address,
> +	};
>  	int referenced = 0;
>  
> -	if (!page_check_address_transhuge(page, mm, address, &pmd, &pte, &ptl))
> -		return SWAP_AGAIN;
> +	while (page_vma_mapped_walk(&pvmw)) {
> +		address = pvmw.address;
>  
> -	if (vma->vm_flags & VM_LOCKED) {
> -		if (pte)
> -			pte_unmap(pte);
> -		spin_unlock(ptl);
> -		pra->vm_flags |= VM_LOCKED;
> -		return SWAP_FAIL; /* To break the loop */
> -	}
> +		if (vma->vm_flags & VM_LOCKED) {
> +			page_vma_mapped_walk_done(&pvmw);
> +			pra->vm_flags |= VM_LOCKED;
> +			return SWAP_FAIL; /* To break the loop */
> +		}
>  
> -	if (pte) {
> -		if (ptep_clear_flush_young_notify(vma, address, pte)) {
> -			/*
> -			 * Don't treat a reference through a sequentially read
> -			 * mapping as such.  If the page has been used in
> -			 * another mapping, we will catch it; if this other
> -			 * mapping is already gone, the unmap path will have
> -			 * set PG_referenced or activated the page.
> -			 */
> -			if (likely(!(vma->vm_flags & VM_SEQ_READ)))
> +		if (pvmw.pte) {
> +			if (ptep_clear_flush_young_notify(vma, address,
> +						pvmw.pte)) {
> +				/*
> +				 * Don't treat a reference through
> +				 * a sequentially read mapping as such.
> +				 * If the page has been used in another mapping,
> +				 * we will catch it; if this other mapping is
> +				 * already gone, the unmap path will have set
> +				 * PG_referenced or activated the page.
> +				 */
> +				if (likely(!(vma->vm_flags & VM_SEQ_READ)))
> +					referenced++;
> +			}
> +		} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> +			if (pmdp_clear_flush_young_notify(vma, address,
> +						pvmw.pmd))
>  				referenced++;
> +		} else {
> +			/* unexpected pmd-mapped page? */
> +			WARN_ON_ONCE(1);
>  		}
> -		pte_unmap(pte);
> -	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> -		if (pmdp_clear_flush_young_notify(vma, address, pmd))
> -			referenced++;
> -	} else {
> -		/* unexpected pmd-mapped page? */
> -		WARN_ON_ONCE(1);
> +
> +		pra->mapcount--;
>  	}
> -	spin_unlock(ptl);
>  
>  	if (referenced)
>  		clear_page_idle(page);
> @@ -936,7 +939,6 @@ static int page_referenced_one(struct page *page, struct vm_area_struct *vma,
>  		pra->vm_flags |= vma->vm_flags;
>  	}
>  
> -	pra->mapcount--;
>  	if (!pra->mapcount)
>  		return SWAP_SUCCESS; /* To break the loop */
>  
> -- 
> 2.11.0
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]
  Powered by Linux