On Mon, Jul 29, 2013 at 02:32:07PM +0900, Joonsoo Kim wrote:
> If we fail with a reserved page, just calling put_page() is not
> sufficient, because put_page() invokes free_huge_page() as its last
> step, and free_huge_page() doesn't know whether the page comes from a
> reserved pool or not. So it does nothing with the reserve count. This
> leaves the reserve count lower than it should be, because the count
> was already decremented in dequeue_huge_page_vma(). This patch fixes
> that situation.

I think we could use a page flag (for example PG_reserve) on a hugepage
in order to record that the hugepage comes from the reserved pool. The
reserve flag would be set when dequeueing a free hugepage, and cleared
when hugepage_fault returns, whether it fails or not. I think this is
simpler than the put_page() variant approach. Wouldn't it solve your
problem as well?

Thanks,
Naoya Horiguchi

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index bb8a45f..6a9ec69 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -649,6 +649,34 @@ struct hstate *size_to_hstate(unsigned long size)
>  	return NULL;
>  }
>
> +static void put_huge_page(struct page *page, int use_reserve)
> +{
> +	struct hstate *h = page_hstate(page);
> +	struct hugepage_subpool *spool =
> +		(struct hugepage_subpool *)page_private(page);
> +
> +	if (!use_reserve) {
> +		put_page(page);
> +		return;
> +	}
> +
> +	if (!put_page_testzero(page))
> +		return;
> +
> +	set_page_private(page, 0);
> +	page->mapping = NULL;
> +	BUG_ON(page_count(page));
> +	BUG_ON(page_mapcount(page));
> +
> +	spin_lock(&hugetlb_lock);
> +	hugetlb_cgroup_uncharge_page(hstate_index(h),
> +				     pages_per_huge_page(h), page);
> +	enqueue_huge_page(h, page);
> +	h->resv_huge_pages++;
> +	spin_unlock(&hugetlb_lock);
> +	hugepage_subpool_put_pages(spool, 1);
> +}
> +
>  static void free_huge_page(struct page *page)
>  {
>  	/*
> @@ -2625,7 +2653,7 @@ retry_avoidcopy:
>  	spin_unlock(&mm->page_table_lock);
>  	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
>
> -	page_cache_release(new_page);
> +	put_huge_page(new_page, use_reserve);
>  out_old_page:
>  	page_cache_release(old_page);
>  out_lock:
> @@ -2725,7 +2753,7 @@ retry:
>
>  	err = add_to_page_cache(page, mapping, idx, GFP_KERNEL);
>  	if (err) {
> -		put_page(page);
> +		put_huge_page(page, use_reserve);
>  		if (err == -EEXIST)
>  			goto retry;
>  		goto out;
>  	}
> @@ -2798,7 +2826,7 @@ backout:
>  	spin_unlock(&mm->page_table_lock);
>  backout_unlocked:
>  	unlock_page(page);
> -	put_page(page);
> +	put_huge_page(page, use_reserve);
>  	goto out;
>  }
>
> --
> 1.7.9.5
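
To make the flag-based suggestion above concrete, here is a toy
userspace model of the bookkeeping it implies. This is purely
illustrative: struct page, PG_RESERVE, resv_huge_pages, and the helper
names below are assumptions made for the sketch, not the real hugetlb
code or an actual kernel page flag.

	/*
	 * Toy model: record "this page consumed a reservation" in the
	 * page itself, so one generic free path can restore the
	 * reserve count without needing a put_page() variant.
	 */
	#include <assert.h>
	#include <stdbool.h>
	#include <stdio.h>

	#define PG_RESERVE (1u << 0)	/* hypothetical "from reserved pool" flag */

	struct page {
		unsigned int flags;
	};

	static unsigned int resv_huge_pages = 1;	/* reservations outstanding */

	/* Dequeue a free hugepage; mark it if it consumed a reservation. */
	static void dequeue_huge_page(struct page *page, bool use_reserve)
	{
		page->flags = 0;
		if (use_reserve) {
			page->flags |= PG_RESERVE;
			resv_huge_pages--;
		}
	}

	/*
	 * Generic free path: provenance lives in the page, so every
	 * failure path can call the same function and the reserve
	 * count still comes out right.
	 */
	static void free_huge_page(struct page *page)
	{
		if (page->flags & PG_RESERVE) {
			resv_huge_pages++;	/* give the reservation back */
			page->flags &= ~PG_RESERVE;
		}
	}

	int main(void)
	{
		struct page page;

		dequeue_huge_page(&page, true);	/* fault path takes a reserved page */
		assert(resv_huge_pages == 0);

		free_huge_page(&page);		/* fault fails and drops the page */
		assert(resv_huge_pages == 1);	/* reservation is restored */

		printf("reserve count after failed fault: %u\n", resv_huge_pages);
		return 0;
	}

The design point is that callers no longer have to thread a use_reserve
argument into every error path: the page itself records whether it
consumed a reservation, and a single common free path fixes up the
count.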