Re: [PATCH v3 3/5] mm/hugetlb: fix getting refcount 0 page in hugetlb_fault()

Hugh Dickins <hughd@xxxxxxxxxx> · Mon, 29 Sep 2014 21:52:24 -0700 (PDT)

On Mon, 15 Sep 2014, Naoya Horiguchi wrote:
> When running the test which causes the race as shown in the previous patch,
> we can hit the BUG "get_page() on refcount 0 page" in hugetlb_fault().

Two minor comments...

> @@ -3192,22 +3208,19 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * Note that locking order is always pagecache_page -> page,
>  	 * so no worry about deadlock.

That sentence of the comment is stale and should be deleted,
now that you only do a trylock_page(page) here.

>  out_mutex:
>  	mutex_unlock(&htlb_fault_mutex_table[hash]);
> +	if (need_wait_lock)
> +		wait_on_page_locked(page);
>  	return ret;
>  }

It will be hard to trigger any problem from this (I guess it would
need memory hotremove), but you ought really to hold a reference to
page while doing a wait_on_page_locked(page).
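
To illustrate the concern, here is a minimal userspace sketch of the
pin-wait-unpin pattern. It is not kernel code: struct mock_page and all
mock_* helpers are hypothetical stand-ins invented for this example, and
the "waiter" simply simulates the lock holder unlocking the page and
dropping its own reference. Where the real get_page() would go in
hugetlb_fault() is an assumption on my part (somewhere the page cannot
yet disappear, e.g. before the page table lock is dropped).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct mock_page {
	int refcount;	/* models the page reference count */
	bool locked;	/* models PG_locked */
	bool freed;	/* set once the last reference is dropped */
};

static void mock_get_page(struct mock_page *p)
{
	/* Mirrors the "get_page() on refcount 0 page" check. */
	assert(p->refcount > 0);
	p->refcount++;
}

static void mock_put_page(struct mock_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/*
 * Models wait_on_page_locked(): while we "sleep", the lock holder
 * unlocks the page and drops its own reference.
 */
static void mock_wait_on_page_locked(struct mock_page *p)
{
	p->locked = false;
	mock_put_page(p);
}

/* Pin the page across the wait; returns true if it stayed alive. */
static bool wait_with_reference(struct mock_page *p)
{
	bool alive;

	mock_get_page(p);		/* pin before sleeping */
	mock_wait_on_page_locked(p);
	alive = !p->freed;		/* our pin kept it from being freed */
	mock_put_page(p);		/* unpin; the page may be freed now */
	return alive;
}

/* No pin: the page can be freed while we are still waiting on it. */
static bool wait_without_reference(struct mock_page *p)
{
	mock_wait_on_page_locked(p);
	return !p->freed;
}
```

With a page that starts at refcount 1 and locked, wait_with_reference()
sees a live page for the whole wait, while wait_without_reference() ends
up looking at a page that was freed underneath it.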
