[PATCH RFC] mm: fix refcount check in mapping_evict_folio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The commit e41c81d0d30e ("mm/truncate: Replace page_mapped() call in
invalidate_inode_page()") replaced the page_mapped(page) check with a
refcount check. However, this refcount check does not work as expected
with drop_caches, at least for btrfs's metadata pages.

Btrfs has a per-sb metadata inode with cached pages, and when not in
active use by btrfs, they have a refcount of 3. One from the initial
call to alloc_pages, one (nr_pages == 1) from filemap_add_folio, and one
from folio_attach_private. We would expect such pages to get dropped by
drop_caches. However, drop_caches calls into mapping_evict_folio via
mapping_try_invalidate which gets a reference on the folio with
find_lock_entries(). As a result, these pages have a refcount of 4, and
fail this check.

For what it's worth, such pages do get reclaimed under memory pressure,
and if I change this refcount check to `if folio_mapped(folio)`

The following script produces such pages and uses drgn to further
analyze the state of the folios:
https://github.com/boryas/scripts/blob/main/sh/strand-meta/run.sh
It should at least outline the basic idea for producing some btrfs
metadata via creating inlined-extent files.

My proposed fix for the issue is to add one more hardcoded refcount to
this check to account for the caller having a refcount on the page.
However, I am less familiar with the other caller into
mapping_evict_folio in the page fault path, so I am concerned this fix
will not work properly there, and would appreciate extra scrutiny there.

Fixes: e41c81d0d30e ("mm/truncate: Replace page_mapped() call in invalidate_inode_page()")
Signed-off-by: Boris Burkov <boris@xxxxxx>
---
 mm/truncate.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/truncate.c b/mm/truncate.c
index 4d61fbdd4b2f..c710c84710b4 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -267,9 +267,17 @@ long mapping_evict_folio(struct address_space *mapping, struct folio *folio)
 		return 0;
 	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return 0;
-	/* The refcount will be elevated if any page in the folio is mapped */
+	/*
+	 * The refcount will be elevated if any page in the folio is mapped.
+	 *
+	 * The refcounts break down as follows:
+	 * 1 per mapped page
+	 * 1 from folio_attach_private, if private is set
+	 * 1 from allocating the page in the first place
+	 * 1 from the caller
+	 */
 	if (folio_ref_count(folio) >
-			folio_nr_pages(folio) + folio_has_private(folio) + 1)
+			folio_nr_pages(folio) + folio_has_private(folio) + 1 + 1)
 		return 0;
 	if (!filemap_release_folio(folio, 0))
 		return 0;
-- 
2.46.0





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux