On 1/2/22 13:57, Matthew Wilcox (Oracle) wrote:
> This replaces try_get_compound_head(). It includes a small optimisation
> for the race where a folio is split between being looked up from its
> tail page and the reference count being obtained. Before, it returned
> NULL, which presumably triggered a retry under the mmap_lock, whereas
> now it will retry without the lock. Finding a frozen page will still
> return NULL.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> ---
>  mm/gup.c | 69 +++++++++++++++++++++++++++++---------------------------
>  1 file changed, 36 insertions(+), 33 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 2c51e9748a6a..58e5cfaaa676 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -29,12 +29,11 @@ struct follow_page_context {
>  	unsigned int page_mask;
>  };
>
> -static void hpage_pincount_add(struct page *page, int refs)
> +static void folio_pincount_add(struct folio *folio, int refs)
>  {
> -	VM_BUG_ON_PAGE(!hpage_pincount_available(page), page);
> -	VM_BUG_ON_PAGE(page != compound_head(page), page);
> +	VM_BUG_ON_FOLIO(!folio_pincount_available(folio), folio);
>
> -	atomic_add(refs, compound_pincount_ptr(page));
> +	atomic_add(refs, folio_pincount_ptr(folio));
>  }
>
>  static void hpage_pincount_sub(struct page *page, int refs)
> @@ -63,33 +62,35 @@ static void put_page_refs(struct page *page, int refs)
>  }
>
>  /*
> - * Return the compound head page with ref appropriately incremented,
> + * Return the folio with ref appropriately incremented,
>   * or NULL if that failed.
>   */
> -static inline struct page *try_get_compound_head(struct page *page, int refs)
> +static inline struct folio *try_get_folio(struct page *page, int refs)
>  {
> -	struct page *head = compound_head(page);
> +	struct folio *folio;
>
> -	if (WARN_ON_ONCE(page_ref_count(head) < 0))
> +retry:
Yes, this new retry looks like a solid improvement. Retrying at this low
level makes a lot of sense, given that it is racing with a very
transient sort of behavior.
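To make sure I'm reading the race correctly, here's a userspace model of the new loop (hypothetical names, C11 atomics standing in for the kernel's page_ref machinery): the folio a tail page maps to can change if the folio is split between the page_folio() lookup and the refcount bump, so we recheck after taking the reference and retry instead of failing:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures, not the real layout. */
struct folio { atomic_int refcount; };
struct page  { _Atomic(struct folio *) folio; };

static struct folio *try_get_folio_model(struct page *page, int refs)
{
	struct folio *folio;
retry:
	folio = atomic_load(&page->folio);
	if (atomic_load(&folio->refcount) < 0)
		return NULL;		/* frozen page: still give up */
	atomic_fetch_add(&folio->refcount, refs);
	/* Did a split move the page to a different folio meanwhile? */
	if (atomic_load(&page->folio) != folio) {
		atomic_fetch_sub(&folio->refcount, refs);
		goto retry;		/* new behaviour: lockless retry */
	}
	return folio;
}
```

The model only captures the control flow, of course; the real recheck relies on page_folio() being recomputed from the page, not on a stored pointer.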
> +	folio = page_folio(page);
> +	if (WARN_ON_ONCE(folio_ref_count(folio) < 0))
>  		return NULL;
> -	if (unlikely(!page_cache_add_speculative(head, refs)))
> +	if (unlikely(!folio_ref_try_add_rcu(folio, refs)))
I'm a little lost about the meaning and intended use of the _rcu aspects
of folio_ref_try_add_rcu() here. For example, try_get_folio() does not
require that callers are in an RCU read-side critical section, right?
This is probably just a documentation question; sorry if it's obvious--I
wasn't able to work it out on my own.
>  		return NULL;
>
>  	/*
> -	 * At this point we have a stable reference to the head page; but it
> -	 * could be that between the compound_head() lookup and the refcount
> -	 * increment, the compound page was split, in which case we'd end up
> -	 * holding a reference on a page that has nothing to do with the page
> +	 * At this point we have a stable reference to the folio; but it
> +	 * could be that between calling page_folio() and the refcount
> +	 * increment, the folio was split, in which case we'd end up
> +	 * holding a reference on a folio that has nothing to do with the page
>  	 * we were given anymore.
> -	 * So now that the head page is stable, recheck that the pages still
> -	 * belong together.
> +	 * So now that the folio is stable, recheck that the page still
> +	 * belongs to this folio.
>  	 */
> -	if (unlikely(compound_head(page) != head)) {
> -		put_page_refs(head, refs);
> -		return NULL;
> +	if (unlikely(page_folio(page) != folio)) {
> +		folio_put_refs(folio, refs);
> +		goto retry;
>  	}
>
> -	return head;
> +	return folio;
>  }
>
>  /**
> @@ -128,8 +129,10 @@ struct page *try_grab_compound_head(struct page *page,
>  					int refs, unsigned int flags)
>  {
>  	if (flags & FOLL_GET)
> -		return try_get_compound_head(page, refs);
> +		return &try_get_folio(page, refs)->page;
Did you want to use folio_page() here, instead?
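(For what it's worth, either spelling should produce the same address, since struct page is the first member of struct folio; a userspace sketch of that layout assumption, with simplified struct definitions standing in for the real ones:)

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: struct folio overlays its first struct page. */
struct page  { unsigned long flags; };
struct folio { struct page page; };	/* first member, so offset 0 */

/* &folio->page reinterprets the folio pointer as a page pointer. */
static struct page *folio_to_page(struct folio *folio)
{
	return &folio->page;
}
```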
>  	else if (flags & FOLL_PIN) {
> +		struct folio *folio;
> +
>  		/*
>  		 * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
>  		 * right zone, so fail and let the caller fall back to the slow
> @@ -143,29 +146,29 @@ struct page *try_grab_compound_head(struct page *page,
>  		 * CAUTION: Don't use compound_head() on the page before this
>  		 * point, the result won't be stable.
>  		 */
> -		page = try_get_compound_head(page, refs);
> -		if (!page)
> +		folio = try_get_folio(page, refs);
> +		if (!folio)
>  			return NULL;
>
>  		/*
> -		 * When pinning a compound page of order > 1 (which is what
> +		 * When pinning a folio of order > 1 (which is what
>  		 * hpage_pincount_available() checks for), use an exact count to
> -		 * track it, via hpage_pincount_add/_sub().
> +		 * track it, via folio_pincount_add/_sub().
>  		 *
> -		 * However, be sure to *also* increment the normal page refcount
> -		 * field at least once, so that the page really is pinned.
> +		 * However, be sure to *also* increment the normal folio refcount
> +		 * field at least once, so that the folio really is pinned.
>  		 * That's why the refcount from the earlier
> -		 * try_get_compound_head() is left intact.
> +		 * try_get_folio() is left intact.
>  		 */
> -		if (hpage_pincount_available(page))
> -			hpage_pincount_add(page, refs);
> +		if (folio_pincount_available(folio))
> +			folio_pincount_add(folio, refs);
>  		else
> -			page_ref_add(page, refs * (GUP_PIN_COUNTING_BIAS - 1));
> +			folio_ref_add(folio,
> +					refs * (GUP_PIN_COUNTING_BIAS - 1));
>
> -		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED,
> -				    refs);
> +		node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
>
> -		return page;
> +		return &folio->page;
>  	}
>
>  	WARN_ON_ONCE(1);
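As a side note on the accounting above, here is a userspace model (hypothetical helper and field names; GUP_PIN_COUNTING_BIAS is 1024 in mainline) of the two pinning schemes the comment describes: an exact pincount for folios of order > 1, versus overloading the refcount by the bias for small folios:

```c
#include <assert.h>

#define GUP_PIN_COUNTING_BIAS 1024	/* matches the kernel's value */

/* Simplified stand-in for a folio's reference state. */
struct folio_model {
	int refcount;
	int pincount;		/* only used when exact counting applies */
	int has_pincount;	/* order > 1: a separate pincount exists */
};

/* Assumes the try_get step already added @refs to the refcount. */
static void grab_pinned_model(struct folio_model *folio, int refs)
{
	if (folio->has_pincount)
		folio->pincount += refs;	/* exact count */
	else
		folio->refcount += refs * (GUP_PIN_COUNTING_BIAS - 1);
}
```

For the small-folio case this leaves the refcount at refs * GUP_PIN_COUNTING_BIAS total, which is why the earlier reference is "left intact" rather than dropped.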
Just minor questions above. This all looks solid though.
thanks,
--
John Hubbard
NVIDIA