+ mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch added to mm-hotfixes-unstable branch

The patch titled
     Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
     mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch

This patch will later appear in the mm-hotfixes-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Dongjoo Seo <dongjoo.linux.dev@xxxxxxxxx>
Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
Date: Wed, 23 Oct 2024 10:50:37 -0700

On a memoryless node, when a process prefers a node with no memory
(e.g., because it is running on a CPU local to that node), the kernel
treats a nearby node with memory as the preferred node.  As a result,
such allocations do not increment the numa_foreign counter on the
memoryless node, which skews the NUMA_HIT, NUMA_MISS, and NUMA_FOREIGN
stats for the nearest node.

This patch corrects this issue by:
1. Checking if the zone or preferred zone is CPU-less before updating
   the NUMA stats.
2. Ensuring NUMA_HIT is only updated if the zone is not CPU-less.
3. Ensuring NUMA_FOREIGN is only updated if the preferred zone is not
   CPU-less.
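
The three rules above can be sketched in userspace as follows.  This is
only an illustrative model, not kernel code: the stats[] and
node_has_cpu[] arrays, the count() helper and zone_statistics_sketch()
are hypothetical stand-ins that use plain node IDs in place of zones and
replace node_state(nid, N_CPU) with a lookup table.  It assumes the
branch structure shown in the diff further down.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4

enum numa_stat {
	NUMA_HIT, NUMA_MISS, NUMA_FOREIGN, NUMA_LOCAL, NUMA_OTHER,
	NR_NUMA_STATS
};

/* Hypothetical stand-ins for per-node counters and the N_CPU node state. */
static unsigned long stats[MAX_NODES][NR_NUMA_STATS];
static bool node_has_cpu[MAX_NODES];

static void count(int nid, enum numa_stat item, unsigned long nr)
{
	assert(nid >= 0 && nid < MAX_NODES);
	stats[nid][item] += nr;
}

/*
 * Model of the patched accounting:
 *   preferred_nid - node the allocation asked for
 *   alloc_nid     - node the pages actually came from
 *   local_nid     - node of the CPU doing the allocation
 */
static void zone_statistics_sketch(int preferred_nid, int alloc_nid,
				   int local_nid, unsigned long nr)
{
	enum numa_stat local_stat = NUMA_LOCAL;
	bool z_is_cpuless, pref_is_cpuless;

	assert(alloc_nid >= 0 && alloc_nid < MAX_NODES);
	assert(preferred_nid >= 0 && preferred_nid < MAX_NODES);
	z_is_cpuless = !node_has_cpu[alloc_nid];
	pref_is_cpuless = !node_has_cpu[preferred_nid];

	/* A CPU-less node can never be "local" to the allocating CPU. */
	if (alloc_nid != local_nid || z_is_cpuless)
		local_stat = NUMA_OTHER;

	/* Rule 2: NUMA_HIT only when the allocating node has CPUs. */
	if (alloc_nid == preferred_nid && !z_is_cpuless) {
		count(alloc_nid, NUMA_HIT, nr);
	} else {
		count(alloc_nid, NUMA_MISS, nr);
		/* Rule 3: NUMA_FOREIGN only when the preferred node has CPUs. */
		if (!pref_is_cpuless)
			count(preferred_nid, NUMA_FOREIGN, nr);
	}
	count(alloc_nid, local_stat, nr);
}
```

In this model, an allocation that prefers and lands on a CPU-less node is
counted as NUMA_MISS and NUMA_OTHER on that node and never as NUMA_HIT or
NUMA_FOREIGN, which is consistent with the node2 column of the "After
Patch" table below (numa_hit 0, numa_miss equal to other_node).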

Example Before and After Patch:
- Before Patch:
                            node0           node1           node2
 numa_hit                86333181       114338269            5108
 numa_miss                5199455               0        56844591
 numa_foreign            32281033        29763013               0
 interleave_hit                91              91               0
 local_node              86326417       114288458               0
 other_node               5206219           49768        56849702

- After Patch:
                            node0           node1           node2
 numa_hit                 2523058         9225528               0
 numa_miss                 150213           10226        21495942
 numa_foreign            17144215         4501270               0
 interleave_hit                91              94               0
 local_node               2493918         9208226               0
 other_node                179351           27528        21495942

Similarly, in the context of cpuless nodes, this patch ensures that NUMA
statistics are accurately updated by adding checks to prevent the
miscounting of memory allocations when the involved nodes have no CPUs.
This ensures more precise tracking of memory access patterns across all
nodes, regardless of whether they have CPUs or not, improving the overall
reliability of NUMA stats.  The reason is that page allocations from
dev_dax, cpuset, memcg, etc. come with a preferred zone on a cpuless
node, and it is hard to track the zone info for miss information.

Link: https://lkml.kernel.org/r/20241023175037.9125-1-dongjoo.linux.dev@xxxxxxxxx
Signed-off-by: Dongjoo Seo <dongjoo.linux.dev@xxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: Fan Ni <nifan@xxxxxxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: Adam Manzanares <a.manzanares@xxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/page_alloc.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes
+++ a/mm/page_alloc.c
@@ -2858,19 +2858,21 @@ static inline void zone_statistics(struc
 {
 #ifdef CONFIG_NUMA
 	enum numa_stat_item local_stat = NUMA_LOCAL;
+	bool z_is_cpuless = !node_state(zone_to_nid(z), N_CPU);
+	bool pref_is_cpuless = !node_state(zone_to_nid(preferred_zone), N_CPU);
 
-	/* skip numa counters update if numa stats is disabled */
 	if (!static_branch_likely(&vm_numa_stat_key))
 		return;
 
-	if (zone_to_nid(z) != numa_node_id())
+	if (zone_to_nid(z) != numa_node_id() || z_is_cpuless)
 		local_stat = NUMA_OTHER;
 
-	if (zone_to_nid(z) == zone_to_nid(preferred_zone))
+	if (zone_to_nid(z) == zone_to_nid(preferred_zone) && !z_is_cpuless)
 		__count_numa_events(z, NUMA_HIT, nr_account);
 	else {
 		__count_numa_events(z, NUMA_MISS, nr_account);
-		__count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
+		if (!pref_is_cpuless)
+			__count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
 	}
 	__count_numa_events(z, local_stat, nr_account);
 #endif
_

Patches currently in -mm which might be from dongjoo.linux.dev@xxxxxxxxx are

mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch




