On Wed, 23 Oct 2024 20:03:24 +0200 Michal Hocko <mhocko@xxxxxxxx> wrote:

> On Wed 23-10-24 10:50:37, Dongjoo Seo wrote:
> > This patch corrects this issue by:
>
> What is this issue? Please describe the problem first,

Actually, relocating the author's second-last paragraph to
top-of-changelog produced a decent result ;)

> ideally describe
> the NUMA topology, workload and what kind of misaccounting happens
> (expected values vs. really reported values).

I think the changelog covered this adequately?

So with these changelog alterations I've queued this for 6.12-rcX with a
cc:stable.  As far as I can tell this has been there since 2018.

: In the case of memoryless node, when a process prefers a node with no
: memory(e.g., because it is running on a CPU local to that node), the
: kernel treats a nearby node with memory as the preferred node.  As a
: result, such allocations do not increment the numa_foreign counter on the
: memoryless node, leading to skewed NUMA_HIT, NUMA_MISS, and NUMA_FOREIGN
: stats for the nearest node.
:
: This patch corrects this issue by:
: 1. Checking if the zone or preferred zone is CPU-less before updating
:    the NUMA stats.
: 2. Ensuring NUMA_HIT is only updated if the zone is not CPU-less.
: 3. Ensuring NUMA_FOREIGN is only updated if the preferred zone is not
:    CPU-less.
:
: Example Before and After Patch:
: - Before Patch:
:                        node0        node1        node2
: numa_hit            86333181    114338269         5108
: numa_miss            5199455            0     56844591
: numa_foreign        32281033     29763013            0
: interleave_hit            91           91            0
: local_node          86326417    114288458            0
: other_node           5206219        49768     56849702
:
: - After Patch:
:                        node0        node1        node2
: numa_hit             2523058      9225528            0
: numa_miss             150213        10226     21495942
: numa_foreign        17144215      4501270            0
: interleave_hit            91           94            0
: local_node           2493918      9208226            0
: other_node            179351        27528     21495942
:
: Similarly, in the context of cpuless nodes, this patch ensures that NUMA
: statistics are accurately updated by adding checks to prevent the
: miscounting of memory allocations when the involved nodes have no CPUs.
: This ensures more precise tracking of memory access patterns across all
: nodes, regardless of whether they have CPUs or not, improving the overall
: reliability of NUMA stat.  The reason is that page allocation from
: dev_dax, cpuset, memcg .. comes with preferred allocating zone in cpuless
: node and it's hard to track the zone info for miss information.
:
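
The three numbered checks in the changelog can be illustrated with a toy
model.  This is a minimal Python sketch of one reading of the changelog,
not the kernel code from the patch: account(), new_stats() and the
has_cpus map are hypothetical names, and the rule that an allocation
landing on a CPU-less node is counted as a miss is an assumption inferred
from the "After Patch" table, where node2 shows numa_hit 0 and numa_miss
equal to other_node.

```python
from collections import Counter


def new_stats(nodes):
    """Return one zeroed counter set per node id (hypothetical helper)."""
    return {node: Counter() for node in nodes}


def account(stats, has_cpus, preferred, actual, local, pages=1):
    """Toy model of per-node NUMA accounting with the CPU-less checks.

    stats     -- dict: node id -> Counter of numa_* events
    has_cpus  -- dict: node id -> True if the node has CPUs
    preferred -- node the allocation preferred
    actual    -- node the pages actually came from
    local     -- node of the CPU doing the allocation
    """
    # Check 2: only count a hit when the node that served the
    # allocation has CPUs (assumed reading of the changelog).
    if actual == preferred and has_cpus[actual]:
        stats[actual]["numa_hit"] += pages
    else:
        stats[actual]["numa_miss"] += pages
        # Check 3: only charge numa_foreign to a preferred node with CPUs.
        if has_cpus[preferred]:
            stats[preferred]["numa_foreign"] += pages

    # local_node/other_node depend only on where the allocating CPU runs.
    key = "local_node" if actual == local else "other_node"
    stats[actual][key] += pages


if __name__ == "__main__":
    # Two nodes with CPUs and one CPU-less node, as in the tables above.
    has_cpus = {0: True, 1: True, 2: False}
    stats = new_stats(has_cpus)
    account(stats, has_cpus, preferred=2, actual=2, local=0)
    for node, counters in stats.items():
        print(node, dict(counters))
```

Under these assumptions, an allocation that prefers and lands on the
CPU-less node shows up as numa_miss and other_node on that node, and no
node is charged numa_foreign for it.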