On Thu 07-05-20 14:02:17, Vlastimil Babka wrote: > During recent patch discussion [1] it became apparent that the "other_node" > definition in the numastat documentation has always been different from actual > implementation. It was also noted that the stats can be innacurate on systems > with memoryless nodes. > > This patch corrects the other_node definition (with minor tweaks to two more > definitions), adds a note about memoryless nodes and also two introductory > paragraphs to the numastat documentation. > > [1] https://lore.kernel.org/linux-mm/20200504070304.127361-1-sandipan@xxxxxxxxxxxxx/T/#u > > Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx> Acked-by: Michal Hocko <mhocko@xxxxxxxx> Thanks! > --- > Documentation/admin-guide/numastat.rst | 31 +++++++++++++++++++++++--- > 1 file changed, 28 insertions(+), 3 deletions(-) > > diff --git a/Documentation/admin-guide/numastat.rst b/Documentation/admin-guide/numastat.rst > index aaf1667489f8..08ec2c2bdce3 100644 > --- a/Documentation/admin-guide/numastat.rst > +++ b/Documentation/admin-guide/numastat.rst > @@ -6,6 +6,21 @@ Numa policy hit/miss statistics > > All units are pages. Hugepages have separate counters. > > +The numa_hit, numa_miss and numa_foreign counters reflect how well processes > +are able to allocate memory from nodes they prefer. If they succeed, numa_hit > +is incremented on the preferred node, otherwise numa_foreign is incremented on > +the preferred node and numa_miss on the node where allocation succeeded. > + > +Usually preferred node is the one local to the CPU where the process executes, > +but restrictions such as mempolicies can change that, so there are also two > +counters based on CPU local node. local_node is similar to numa_hit and is > +incremented on allocation from a node by CPU on the same node. other_node is > +similar to numa_miss and is incremented on the node where allocation succeeds > +from a CPU from a different node. Note there is no counter analogical to > +numa_foreign. > + > +In more detail: > + > =============== ============================================================ > numa_hit A process wanted to allocate memory from this node, > and succeeded. > @@ -14,11 +29,13 @@ numa_miss A process wanted to allocate memory from another node, > but ended up with memory from this node. > > numa_foreign A process wanted to allocate on this node, > - but ended up with memory from another one. > + but ended up with memory from another node. > > -local_node A process ran on this node and got memory from it. > +local_node A process ran on this node's CPU, > + and got memory from this node. > > -other_node A process ran on this node and got memory from another node. > +other_node A process ran on a different node's CPU > + and got memory from this node. > > interleave_hit Interleaving wanted to allocate from this node > and succeeded. > @@ -28,3 +45,11 @@ For easier reading you can use the numastat utility from the numactl package > (http://oss.sgi.com/projects/libnuma/). Note that it only works > well right now on machines with a small number of CPUs. > > +Note that on systems with memoryless nodes (where a node has CPUs but no > +memory) the numa_hit, numa_miss and numa_foreign statistics can be skewed > +heavily. In the current kernel implementation, if a process prefers a > +memoryless node (i.e. because it is running on one of its local CPU), the > +implementation actually treats one of the nearest nodes with memory as the > +preferred node. As a result, such allocation will not increase the numa_foreign > +counter on the memoryless node, and will skew the numa_hit, numa_miss and > +numa_foreign statistics of the nearest node. > -- > 2.26.2 -- Michal Hocko SUSE Labs