[PATCH] mm: make page_cache_get_speculative() work on tail pages

Generic RCU fast GUP relies on page_cache_get_speculative() to obtain a
pin on a pte-mapped page.  As pointed out by Aneesh during review of my
compound pages refcounting rework, page_cache_get_speculative() would
fail on a pte-mapped tail page, since tail pages always have
page->_count == 0.
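
To illustrate the failure, here is a minimal userspace sketch of the
"get unless zero" semantics, using C11 atomics. It is not kernel code:
struct toy_page, toy_get_unless_zero() and toy_put() are hypothetical
stand-ins for struct page, get_page_unless_zero() and put_page(), and
only the refcount is modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct toy_page {
	atomic_int count;
};

/* Take a reference only if the count is not already zero. */
static bool toy_get_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken earlier. */
static void toy_put(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->count, 1);

	assert(old > 0);
	(void)old;
}
```

A page modeled with count == 0, as a tail page would be, can never be
pinned this way, no matter how many references the head page holds.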

That means we would never be able to successfully obtain a pin on a
pte-mapped tail page via generic RCU fast GUP.

But the problem is not exclusive to my patchset. In the current kernel,
some drivers (sound, for instance) already map compound pages with PTEs.

Let's teach page_cache_get_speculative() about tail pages. We can
acquire a pin by speculatively taking a pin on the head page and then
rechecking that the compound page didn't disappear under us. Retry if
it did.
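
The scheme can be sketched in userspace with C11 atomics. This is a
simplified model under stated assumptions, not the kernel
implementation: struct toy_page and the toy_*() helpers are
hypothetical, the 'tail' flag models PageTail(), 'first' models
page->first_page, and the default sequentially consistent atomics
stand in for the explicit barrier used in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of struct page. */
struct toy_page {
	atomic_int count;               /* models page->_count */
	atomic_bool tail;               /* models PageTail() */
	struct toy_page *_Atomic first; /* models page->first_page */
};

/* Take a reference only if the count is not already zero. */
static bool toy_get_unless_zero(struct toy_page *p)
{
	int old = atomic_load(&p->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&p->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken earlier. */
static void toy_put(struct toy_page *p)
{
	int old = atomic_fetch_sub(&p->count, 1);

	assert(old > 0);
	(void)old;
}

/* Head page of a tail page, or the page itself otherwise. */
static struct toy_page *toy_head(struct toy_page *p)
{
	if (atomic_load(&p->tail))
		return atomic_load(&p->first);
	return p;
}

/* Returns true with a reference held on the head page. */
static bool toy_get_speculative(struct toy_page *page)
{
	struct toy_page *head;

retry:
	head = toy_head(page);
	if (!toy_get_unless_zero(head))
		return false;	/* freed or being freed: caller retries */

	if (head != page && !atomic_load(&page->tail)) {
		/* The compound page went away under us: undo and retry. */
		toy_put(head);
		goto retry;
	}
	return true;
}
```

In this model the reference always lands on the head page, so the tail
page's own count stays at zero even after a successful pin.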

We don't care about THP tail page refcounting -- THP *tail* pages
shouldn't be found where page_cache_get_speculative() is used: in the
pagecache radix tree or in page tables.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Reported-by: "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
Cc: Steve Capper <steve.capper@xxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
---
 include/linux/pagemap.h | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 7c3790764795..573a2510da36 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -142,8 +142,10 @@ void release_pages(struct page **pages, int nr, bool cold);
  */
 static inline int page_cache_get_speculative(struct page *page)
 {
+	struct page *head_page;
 	VM_BUG_ON(in_interrupt());
-
+retry:
+	head_page = compound_head_fast(page);
 #ifdef CONFIG_TINY_RCU
 # ifdef CONFIG_PREEMPT_COUNT
 	VM_BUG_ON(!in_atomic());
@@ -157,11 +159,11 @@ static inline int page_cache_get_speculative(struct page *page)
 	 * disabling preempt, and hence no need for the "speculative get" that
 	 * SMP requires.
 	 */
-	VM_BUG_ON_PAGE(page_count(page) == 0, page);
-	atomic_inc(&page->_count);
+	VM_BUG_ON_PAGE(page_count(head_page) == 0, head_page);
+	atomic_inc(&head_page->_count);
 
 #else
-	if (unlikely(!get_page_unless_zero(page))) {
+	if (unlikely(!get_page_unless_zero(head_page))) {
 		/*
 		 * Either the page has been freed, or will be freed.
 		 * In either case, retry here and the caller should
@@ -170,7 +172,26 @@ static inline int page_cache_get_speculative(struct page *page)
 		return 0;
 	}
 #endif
-	VM_BUG_ON_PAGE(PageTail(page), page);
+	/* compound_head_fast() saw PageTail(page) == true */
+	if (unlikely(head_page != page)) {
+		/*
+		 * compound_head_fast() could fetch dangling page->first_page
+		 * pointer to an old compound page, so recheck that it's still
+		 * a tail page before returning.
+		 */
+		smp_mb__after_atomic();
+		if (unlikely(!PageTail(page))) {
+			put_page(head_page);
+			goto retry;
+		}
+		/*
+		 * Tail page refcounting is only required for THP pages.
+		 * If page_cache_get_speculative() got called on tail-THP pages
+		 * something went horribly wrong. We don't have THP in pagecache
+		 * and we don't map tail-THP to page tables.
+		 */
+		VM_BUG_ON_PAGE(compound_tail_refcounted(head_page), head_page);
+	}
 
 	return 1;
 }
-- 
2.1.4
