Re: [PATCH] mm: Do not reclaim private data from pinned page

David Hildenbrand <david@xxxxxxxxxx> · Tue, 2 May 2023 17:53:08 +0200

On 02.05.23 17:48, Peter Xu wrote:
On Tue, May 02, 2023 at 05:33:22PM +0200, David Hildenbrand wrote:
On 02.05.23 17:26, Peter Xu wrote:
On Fri, Apr 28, 2023 at 02:41:40PM +0200, Jan Kara wrote:
If the page is pinned, there's no point in trying to reclaim it.
Furthermore if the page is from the page cache we don't want to reclaim
fs-private data from the page because the pinning process may be writing
to the page at any time and reclaiming fs private info on a dirty page
can upset the filesystem (see link below).

Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@xxxxxxxxxxxxxx
Signed-off-by: Jan Kara <jack@xxxxxxx>
---
   mm/vmscan.c | 10 ++++++++++
   1 file changed, 10 insertions(+)

This was the non-controversial part of my series [1] dealing with pinned pages
in filesystems. It is already a win as it avoids crashes in the filesystem and
we can drop workarounds for this in ext4. Can we merge it please?

[1] https://lore.kernel.org/all/20230209121046.25360-1-jack@xxxxxxx/

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bf3eedf0209c..401a379ea99a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1901,6 +1901,16 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
   			}
   		}
+		/*
+		 * Folio is unmapped now so it cannot be newly pinned anymore.
+		 * No point in trying to reclaim folio if it is pinned.
+		 * Furthermore we don't want to reclaim underlying fs metadata
+		 * if the folio is pinned and thus potentially modified by the
+		 * pinning process as that may upset the filesystem.
+		 */
+		if (folio_maybe_dma_pinned(folio))
+			goto activate_locked;
+
   		mapping = folio_mapping(folio);
   		if (folio_test_dirty(folio)) {
   			/*
--
2.35.3



IIUC we have similar handling for anon (feb889fb40fafc).  Should we merge
the two sites and just move the check earlier?  Thanks,


feb889fb40fafc introduced a best-effort check that is racy, as the page is
still mapped (can still get pinned). Further, we get false positives mostly
only if a page is shared very often (1024 times), which happens rarely with
anon pages. Now that we handle COW+pinning correctly using
PageAnonExclusive, that check only optimizes for the "already pinned" case.
But it's not required for correctness anymore (so it can be racy).

Here, however, we want more precision, and not false positives simply
because a page is mapped many times (which can happen easily) or can still
get pinned while mapped.
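
To make the false-positive point concrete, here is a minimal userspace sketch of the refcount heuristic behind folio_maybe_dma_pinned() for small folios. It assumes the usual scheme where each pin adds a bias of 1024 to the shared reference count and the check is a simple threshold test. The names model_folio, model_get, model_pin and model_maybe_pinned are hypothetical stand-ins, not kernel API, and large folios (which track pins in a separate counter) are not modelled.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's GUP_PIN_COUNTING_BIAS of 1024. */
#define MODEL_PIN_BIAS 1024u

/* Hypothetical stand-in for a small folio: pins and plain references
 * share a single counter, so they cannot be told apart exactly. */
struct model_folio {
	unsigned int refcount;
};

/* An ordinary reference (e.g. one more mapping) adds 1. */
static inline void model_get(struct model_folio *f)
{
	f->refcount += 1;
}

/* A pin adds the whole bias in one step. */
static inline void model_pin(struct model_folio *f)
{
	f->refcount += MODEL_PIN_BIAS;
}

/* Threshold test: true for any pinned folio, but also true for an
 * unpinned folio that has accumulated 1024 ordinary references. */
static inline bool model_maybe_pinned(const struct model_folio *f)
{
	return f->refcount >= MODEL_PIN_BIAS;
}
```

In this model a folio holding one reference reports "not pinned", a single model_pin() flips it to "maybe pinned", and 1024 calls to model_get() on an unpinned folio also report "maybe pinned". The last case is the false positive discussed above: it is rare for anon pages but easy to hit for a page-cache page mapped many times, which is why the new check only runs once the folio is unmapped.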

Ah makes sense, thanks.

Acked-by: Peter Xu <peterx@xxxxxxxxxx>

This is not obvious from simply reading the two commits, though. It would be
great to mention the relationship between the two checks somewhere, in either
a comment or the commit message.

I once had a patch lying around to document the existing check:

https://github.com/davidhildenbrand/linux/commit/abb01d42a99b56e2c5e707ba80ddc8b05ad7d618

--
Thanks,

David / dhildenb