Re: [PATCH] mm: incorporate read-only pages into transparent huge pages

Rik van Riel <riel@xxxxxxxxxx> · Fri, 23 Jan 2015 14:04:11 -0500

On 01/23/2015 02:47 AM, Ebru Akagunduz wrote:

> @@ -2169,7 +2169,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
>  
>  		/* cannot use mapcount: can't collapse if there's a gup pin */
> -		if (page_count(page) != 1)
> +		if (page_count(page) != 1 + !!PageSwapCache(page))
>  			goto out;
>  		/*
>  		 * We can do it before isolate_lru_page because the
> @@ -2179,6 +2179,17 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		 */
>  		if (!trylock_page(page))
>  			goto out;
> +		if (!pte_write(pteval)) {
> +			if (PageSwapCache(page) && !reuse_swap_page(page)) {
> +					unlock_page(page);
> +					goto out;
> +			}
> +			/*
> +			 * Page is not in the swap cache, and page count is
> +			 * one (see above). It can be collapsed into a THP.
> +			 */
> +		}

Andrea pointed out a bug between the above two parts of
the patch.

Between the point where we check page_count(page) and the
point where we check whether the page got added to the swap
cache, the page count may change, leaving us in a race with
get_user_pages_fast, the pageout code, etc.

It is necessary to check the page count again right after
the trylock_page(page) above, to make sure it was not changed
while the page was not yet locked.

That second check should have a comment explaining that
the first "page_count(page) != 1 + !!PageSwapCache(page)"
check is not reliable because the page is not yet locked
at that point, so the check needs to be repeated. Maybe
something along the lines of:

     /* Re-check the page count with the page locked */
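
To make the shape of that check/lock/re-check sequence concrete, here
is a minimal userspace sketch. It is not kernel code and not the
actual fix: struct page_model, model_trylock(), model_unlock(),
can_isolate() and the race hook are hypothetical stand-ins that only
model page_count(), PageSwapCache() and the page lock as plain fields.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct page_model {
	int count;	/* models page_count(page) */
	bool swapcache;	/* models PageSwapCache(page) */
	bool locked;	/* models the page lock */
};

/* Optional hook that simulates another task running in the race window. */
typedef void (*race_fn)(struct page_model *);

/* The count the patch expects: 1, plus 1 if the page is in the swap cache. */
static int expected_count(const struct page_model *p)
{
	return 1 + !!p->swapcache;
}

static bool model_trylock(struct page_model *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

static void model_unlock(struct page_model *p)
{
	p->locked = false;
}

/*
 * Returns true, with the model page left locked, if the page passes
 * both count checks. Returns false with the page unlocked otherwise.
 */
static bool can_isolate(struct page_model *p, race_fn race)
{
	/* First check, done without the page lock. */
	if (p->count != expected_count(p))
		return false;

	/* Window in which the count may change (assumption: modeled by a hook). */
	if (race != NULL)
		race(p);

	if (!model_trylock(p))
		return false;

	/* Re-check the page count with the page locked */
	if (p->count != expected_count(p)) {
		model_unlock(p);
		return false;
	}
	return true;
}

/* Simulates an extra reference being taken in the window, e.g. a gup pin. */
static void racing_pin(struct page_model *p)
{
	p->count++;
}
```

In this model, a page with count 1 and no racing hook passes, while
the same page with racing_pin() as the hook is rejected by the second
check and left unlocked. A page that is added to the swap cache in the
window (count and swapcache both go up) still passes, which matches
the "1 + !!PageSwapCache(page)" expression in the patch.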
