[to-be-updated] mm-compaction-skip-memory-compaction-when-there-are-not-enough-migratable-pages.patch removed from -mm tree

The quilt patch titled
     Subject: mm: compaction: skip memory compaction when there are not enough migratable pages
has been removed from the -mm tree.  Its filename was
     mm-compaction-skip-memory-compaction-when-there-are-not-enough-migratable-pages.patch

This patch was dropped because an updated version will be issued.

------------------------------------------------------
From: yangge <yangge1116@xxxxxxx>
Subject: mm: compaction: skip memory compaction when there are not enough migratable pages
Date: Wed, 8 Jan 2025 19:30:54 +0800

There are 4 NUMA nodes on my machine, and each NUMA node has 32GB of
memory.  I have configured 16GB of CMA memory on each NUMA node, and
starting a 32GB virtual machine with device passthrough is extremely slow,
taking almost an hour.

During startup, the virtual machine calls
pin_user_pages_remote(..., FOLL_LONGTERM, ...) to allocate memory.  Long
term GUP cannot allocate memory from the CMA area, so at most 16GB of
non-CMA memory on a NUMA node can be used as virtual machine memory.  The
16GB of free CMA memory on the node is enough to pass the order-0
watermark check, so __compaction_suitable() consistently returns true.
However, if there aren't enough migratable pages available, memory
compaction is pointless.  Besides checking whether the order-0 watermark
is met, __compaction_suitable() also needs to determine whether there are
sufficient migratable pages available for memory compaction.
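
The extra condition described above can be modelled in user space.  What
follows is only a sketch: struct node_stats and enough_migratable_pages()
are hypothetical names that do not exist in the kernel, and the three
fields stand in for the per-node counters the patch reads.  The clamping
to zero is an addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical snapshot of the per-node counters (not a kernel type). */
struct node_stats {
	unsigned long lru_pages;    /* active + inactive + unevictable pages */
	unsigned long pin_acquired; /* models NR_FOLL_PIN_ACQUIRED */
	unsigned long pin_released; /* models NR_FOLL_PIN_RELEASED */
};

/* Returns true when at least 2^order unpinned LRU pages remain. */
bool enough_migratable_pages(const struct node_stats *s, unsigned int order)
{
	unsigned long pinned, migratable;

	/* The shift below is only defined for orders narrower than the type. */
	assert(order < sizeof(unsigned long) * CHAR_BIT);

	/* Sketch-only choice: clamp to zero rather than wrap on underflow. */
	pinned = s->pin_acquired > s->pin_released ?
		 s->pin_acquired - s->pin_released : 0;
	migratable = s->lru_pages > pinned ? s->lru_pages - pinned : 0;

	return migratable >= (1UL << order);
}
```

With 1000 LRU pages of which 990 are pinned, an order-9 request (512
pages) fails this check while an order-3 request (8 pages) passes.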

For costly allocations, because __compaction_suitable() always returns
true, __alloc_pages_slowpath() can't exit at the appropriate place,
resulting in excessively long virtual machine startup times.

Call trace:
__alloc_pages_slowpath
    if (compact_result == COMPACT_SKIPPED ||
        compact_result == COMPACT_DEFERRED)
        goto nopage; // should exit __alloc_pages_slowpath() from here
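
The effect of the suitability check on this exit can be sketched as a
small user-space model.  The enum and both functions below are
hypothetical stand-ins, not kernel code, and the real slowpath applies
further conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the kernel's compaction results; values are arbitrary. */
enum compact_outcome {
	OUTCOME_SKIPPED,   /* models COMPACT_SKIPPED */
	OUTCOME_DEFERRED,  /* models COMPACT_DEFERRED */
	OUTCOME_ATTEMPTED  /* any result where compaction actually ran */
};

/* Compaction is skipped whenever the suitability check fails. */
enum compact_outcome compaction_outcome(bool suitable, bool deferred)
{
	if (deferred)
		return OUTCOME_DEFERRED;
	return suitable ? OUTCOME_ATTEMPTED : OUTCOME_SKIPPED;
}

/* Models the early exit quoted above: give up instead of retrying. */
bool slowpath_bails_out(enum compact_outcome r)
{
	switch (r) {
	case OUTCOME_SKIPPED:
	case OUTCOME_DEFERRED:
		return true;
	case OUTCOME_ATTEMPTED:
		return false;
	}
	assert(!"unknown compaction outcome");
	return false;
}
```

In this model, a suitability check that always reports true never yields
OUTCOME_SKIPPED, so the early exit is only reachable through deferral.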

When the 16GB of non-CMA memory on a single node is exhausted, we fall
back to allocating memory on other nodes.  To fall back to remote nodes
quickly, we should skip memory compaction when migratable pages are
insufficient.  After this fix, it takes only a few tens of seconds to
start a 32GB virtual machine with device passthrough.

Link: https://lkml.kernel.org/r/1736335854-548-1-git-send-email-yangge1116@xxxxxxx
Signed-off-by: yangge <yangge1116@xxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/compaction.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

--- a/mm/compaction.c~mm-compaction-skip-memory-compaction-when-there-are-not-enough-migratable-pages
+++ a/mm/compaction.c
@@ -2383,7 +2383,27 @@ static bool __compaction_suitable(struct
 				  int highest_zoneidx,
 				  unsigned long wmark_target)
 {
+	pg_data_t __maybe_unused *pgdat = zone->zone_pgdat;
+	unsigned long sum, nr_pinned;
 	unsigned long watermark;
+
+	sum = node_page_state(pgdat, NR_INACTIVE_FILE) +
+		node_page_state(pgdat, NR_INACTIVE_ANON) +
+		node_page_state(pgdat, NR_ACTIVE_FILE) +
+		node_page_state(pgdat, NR_ACTIVE_ANON) +
+		node_page_state(pgdat, NR_UNEVICTABLE);
+
+	nr_pinned = node_page_state(pgdat, NR_FOLL_PIN_ACQUIRED) -
+		node_page_state(pgdat, NR_FOLL_PIN_RELEASED);
+
+	/*
+	 * Gup-pinned pages are non-migratable. After subtracting these pages,
+	 * we need to check if the remaining pages are sufficient for memory
+	 * compaction.
+	 */
+	if ((sum - nr_pinned) < (1 << order))
+		return false;
+
 	/*
 	 * Watermarks for order-0 must be met for compaction to be able to
 	 * isolate free pages for migration targets. This means that the
_

Patches currently in -mm which might be from yangge1116@xxxxxxx are





