Currently we always take a folio reference even if migration will not
even be tried or isolation failed, requiring us to grab+drop an additional
reference.

Further, we end up calling folio_likely_mapped_shared() while the folio
might already have been unmapped: once we drop the PTL, that can easily
happen. We want to stop touching mapcounts and friends from such
context, and only call folio_likely_mapped_shared() while the folio is
still mapped: mapcount information is pretty much stale and unreliable
otherwise.
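
To illustrate: in the old flow (a condensed sketch of the callsite, not
the literal code), the mapcount-based check only ran after the PTL was
dropped:

	folio_get(folio);			/* temporary reference */
	pte_unmap_unlock(vmf->pte, vmf->ptl);	/* PTL dropped */
	/* A concurrent unmap can now zap the PTE and drop the mapcount. */
	migrate_misplaced_folio(folio, vma, target_nid);
	/*
	 * ... which ends up calling folio_likely_mapped_shared() via
	 * numamigrate_isolate_folio() on a possibly unmapped folio.
	 */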
So let's move checks into numamigrate_isolate_folio(), rename that
function to migrate_misplaced_folio_prepare(), and call that function
from callsites where we call migrate_misplaced_folio(), but still with
the PTL held.

We can now stop taking temporary folio references, and really only take
a reference if folio isolation succeeded. Doing the
folio_likely_mapped_shared() + folio isolation under PT lock is now
similar to how we handle MADV_PAGEOUT.
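
The resulting caller pattern (condensed from the do_huge_pmd_numa_page()
hunk below) looks like:

	/* still holding the PTL */
	if (migrate_misplaced_folio_prepare(folio, vma, target_nid)) {
		flags |= TNF_MIGRATE_FAIL;	/* not isolated, no reference taken */
		goto out_map;
	}
	/* The folio is isolated and isolation code holds a folio reference. */
	spin_unlock(vmf->ptl);
	migrate_misplaced_folio(folio, vma, target_nid);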
While at it, combine the folio_is_file_lru() checks.
Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
 include/linux/migrate.h |  7 ++++
 mm/huge_memory.c        |  8 ++--
 mm/memory.c             |  9 +++--
 mm/migrate.c            | 81 +++++++++++++++++++----------------------
 4 files changed, 55 insertions(+), 50 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index f9d92482d117..644be30b69c8 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -139,9 +139,16 @@ const struct movable_operations *page_movable_ops(struct page *page)
 }
 
 #ifdef CONFIG_NUMA_BALANCING
+int migrate_misplaced_folio_prepare(struct folio *folio,
+		struct vm_area_struct *vma, int node);
 int migrate_misplaced_folio(struct folio *folio, struct vm_area_struct *vma,
 			    int node);
 #else
+static inline int migrate_misplaced_folio_prepare(struct folio *folio,
+		struct vm_area_struct *vma, int node)
+{
+	return -EAGAIN; /* can't migrate now */
+}
 static inline int migrate_misplaced_folio(struct folio *folio,
 		struct vm_area_struct *vma, int node)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index fc27dabcd8e3..4b2817bb2c7d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1688,11 +1688,13 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	if (node_is_toptier(nid))
 		last_cpupid = folio_last_cpupid(folio);
 	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
-	if (target_nid == NUMA_NO_NODE) {
-		folio_put(folio);
+	if (target_nid == NUMA_NO_NODE)
+		goto out_map;
+	if (migrate_misplaced_folio_prepare(folio, vma, target_nid)) {
+		flags |= TNF_MIGRATE_FAIL;
 		goto out_map;
 	}
-
+	/* The folio is isolated and isolation code holds a folio reference. */
 	spin_unlock(vmf->ptl);
 	writable = false;
 
diff --git a/mm/memory.c b/mm/memory.c
index 118660de5bcc..4fd1ecfced4d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5207,8 +5207,6 @@ int numa_migrate_prep(struct folio *folio, struct vm_fault *vmf,
 {
 	struct vm_area_struct *vma = vmf->vma;
 
-	folio_get(folio);
-
 	/* Record the current PID acceesing VMA */
 	vma_set_access_pid_bit(vma);
 