[PATCH v6 0/4] numa: describe sibling nodes distances

From: Wim ten Have <wim.ten.have@xxxxxxxxxx>

This patch series extends guest domain administration by adding support
for advertising node sibling distances when configuring NUMA guests,
also referred to as vNUMA (Virtual NUMA).

NUMA (Non-Uniform Memory Access) is a method of configuring a cluster
of nodes within a single multiprocessing system so that each processor
has its own local memory which it shares with the others, improving
performance and the ability of the system to be expanded.

A NUMA system can be illustrated as shown below. In this 4-node
system, every socket is equipped with its own distinct memory and some
also with I/O. Access to memory or I/O on remote nodes is only possible
through the "Interconnect". This results in different performance for
local and remote resources.

In contrast to NUMA there is the flat SMP system, where no concept of
local or remote resources exists.  The disadvantage of SMP systems with
a high socket count is that the shared bus can easily become a
performance bottleneck under high activity.

    +-------------+-------+        +-------+-------------+
    |NODE0|       |       |        |       |       |NODE3|
    |     | CPU00 | CPU03 |        | CPU12 | CPU15 |     |
    |     |       |       |        |       |       |     |
    | Mem +--- Socket0 ---<-------->--- Socket3 ---+ Mem |
    |     |       |       |        |       |       |     |
    +-----+ CPU01 | CPU02 |        | CPU13 | CPU14 |     |
    | I/O |       |       |        |       |       |     |
    +-----+-------^-------+        +-------^-------+-----+
                  |                        |
                  |      Interconnect      |
                  |                        |
    +-------------v-------+        +-------v-------------+
    |NODE1|       |       |        |       |       |NODE2|
    |     | CPU04 | CPU07 |        | CPU08 | CPU11 |     |
    |     |       |       |        |       |       |     |
    | Mem +--- Socket1 ---<-------->--- Socket2 ---+ Mem |
    |     |       |       |        |       |       |     |
    +-----+ CPU05 | CPU06 |        | CPU09 | CPU10 |     |
    | I/O |       |       |        |       |       |     |
    +-----+-------+-------+        +-------+-------+-----+

NUMA adds an intermediate level of memory shared amongst a few cores
per socket as illustrated above, so that data accesses do not have to
travel over a single bus.

Unfortunately, the way NUMA does this adds its own limitations. As
visualized in the illustration above, they show up when data stored in
memory associated with Socket2 is accessed by a CPU (core) in Socket0.
The processors use the "Interconnect" path to access resources on other
nodes, and these "Interconnect" hops add data access delays. It is
therefore in our interest to describe the relative distances between
nodes.
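
To make the notion of "Interconnect" hops concrete, the following is a
minimal sketch in Python (not part of this patch set) that counts the
minimum number of hops between nodes. It assumes the four links read off
the illustration above (Socket0-Socket1, Socket1-Socket2, Socket2-Socket3
and Socket3-Socket0); the names LINKS, hop_counts and HOPS are
illustrative only.

```python
from collections import deque

# Interconnect links as read off the illustration (an assumption of this
# sketch): the four sockets form a ring 0 - 1 - 2 - 3 - 0.
LINKS = [(0, 1), (1, 2), (2, 3), (3, 0)]


def hop_counts(num_nodes, links):
    """Return a matrix of minimum interconnect hops between node pairs."""
    neighbours = {node: set() for node in range(num_nodes)}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    matrix = []
    for start in range(num_nodes):
        # Breadth-first search from 'start' gives the shortest hop count.
        hops = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in hops:
                    hops[nxt] = hops[node] + 1
                    queue.append(nxt)
        matrix.append([hops[node] for node in range(num_nodes)])
    return matrix


HOPS = hop_counts(4, LINKS)
for row in HOPS:
    print(row)
```

In this model a CPU in Socket0 reaches memory on Socket2 in two hops,
while Socket1 and Socket3 are one hop away.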

The relative distances between nodes are described in the system's SLIT
(System Locality Distance Information Table) which is part of the ACPI
(Advanced Configuration and Power Interface) specification.

On Linux systems the SLIT detail can be listed with the 'numactl -H'
command. A guest modelled on the above system would show the following
output.

    [root@f25 ~]# numactl -H
    available: 4 nodes (0-3)
    node 0 cpus: 0 1 2 3
    node 0 size: 2007 MB
    node 0 free: 1931 MB
    node 1 cpus: 4 5 6 7
    node 1 size: 1951 MB
    node 1 free: 1902 MB
    node 2 cpus: 8 9 10 11
    node 2 size: 1998 MB
    node 2 free: 1910 MB
    node 3 cpus: 12 13 14 15
    node 3 size: 2015 MB
    node 3 free: 1907 MB
    node distances:
    node   0   1   2   3
      0:  10  21  31  21
      1:  21  10  21  31
      2:  31  21  10  21
      3:  21  31  21  10
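
As a rough illustration of how such a table can be sanity-checked, the
Python sketch below (not part of this patch set) parses the distance rows
shown above and applies three checks: the local distance is 10, remote
distances are larger than 10, and the matrix is symmetric. The symmetry
check is an assumption that happens to hold for this sample and is not
claimed to be a general requirement. The names parse_distances and
check_distances are hypothetical.

```python
# Distance rows copied from the 'numactl -H' output above.
NUMACTL_DISTANCES = """\
node   0   1   2   3
  0:  10  21  31  21
  1:  21  10  21  31
  2:  31  21  10  21
  3:  21  31  21  10
"""


def parse_distances(text):
    """Parse the distance table into a list of integer rows."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the column header line
        _, values = line.split(":")
        rows.append([int(value) for value in values.split()])
    return rows


def check_distances(matrix, local=10):
    """Return a list of problems; an empty list means all checks passed."""
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        return ["matrix is not square"]
    problems = []
    for i in range(size):
        for j in range(size):
            value = matrix[i][j]
            if i == j and value != local:
                problems.append(f"node {i}: local distance is {value}")
            elif i != j and value <= local:
                problems.append(f"{i}->{j}: remote distance {value} too small")
            elif i < j and value != matrix[j][i]:
                # Assumption of this sketch: distances are symmetric.
                problems.append(f"{i}->{j} and {j}->{i} differ")
    return problems


DISTANCES = parse_distances(NUMACTL_DISTANCES)
PROBLEMS = check_distances(DISTANCES)
print(PROBLEMS if PROBLEMS else "distance table looks consistent")
```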
    
These patches extend core libvirt's XML description of NUMA cells to
include NUMA distance information and propagate it to Xen guests via
libxl.  qemu recently gained support for constructing the SLIT in
commit 0f203430dd ("numa: Allow setting NUMA distance for different NUMA
nodes"). The core libvirt extensions in this patch set could be used to
propagate NUMA distances to qemu guests in the future.
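
For orientation only, the Python sketch below generates what a NUMA cell
description with distances might look like for the 4-node example. The
element and attribute names (<numa>, <cell>, <distances>, <sibling id=
value=>) are assumed from libvirt's domain XML documentation and are not
spelled out in this cover letter; the per-cell memory size of 2 GiB is a
made-up value, and build_numa_xml is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Distances and CPU ranges taken from the 'numactl -H' example.
DISTANCES = [
    [10, 21, 31, 21],
    [21, 10, 21, 31],
    [31, 21, 10, 21],
    [21, 31, 21, 10],
]
CPUS = ["0-3", "4-7", "8-11", "12-15"]
MEMORY_KIB = 2 * 1024 * 1024  # hypothetical: 2 GiB per cell


def build_numa_xml(distances, cpus, memory_kib):
    """Build a <numa> element with one <cell> per node (assumed schema)."""
    numa = ET.Element("numa")
    for cell_id, row in enumerate(distances):
        cell = ET.SubElement(
            numa, "cell",
            id=str(cell_id), cpus=cpus[cell_id],
            memory=str(memory_kib), unit="KiB",
        )
        siblings = ET.SubElement(cell, "distances")
        for sibling_id, value in enumerate(row):
            ET.SubElement(siblings, "sibling",
                          id=str(sibling_id), value=str(value))
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(numa)
    return ET.tostring(numa, encoding="unicode")


NUMA_XML = build_numa_xml(DISTANCES, CPUS, MEMORY_KIB)
print(NUMA_XML)
```

The documentation and schema changes in the first patch of the series
remain the authoritative description of the accepted syntax.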

Wim ten Have (4):
  numa: describe siblings distances within cells
  xenconfig: add domxml conversions for xen-xl
  libxl: vnuma support
  xlconfigtest: add tests for numa cell sibling distances

 docs/formatdomain.html.in                          |  63 +++-
 docs/schemas/basictypes.rng                        |   7 +
 docs/schemas/cputypes.rng                          |  18 ++
 src/conf/numa_conf.c                               | 328 +++++++++++++++++++-
 src/conf/numa_conf.h                               |  25 ++
 src/libvirt_private.syms                           |   5 +
 src/libxl/libxl_conf.c                             | 119 ++++++++
 src/libxl/libxl_domain.c                           |   7 +-
 src/xenconfig/xen_xl.c                             | 335 +++++++++++++++++++++
 tests/libxlxml2domconfigdata/basic-hvm.json        |  95 +++++-
 tests/libxlxml2domconfigdata/basic-hvm.xml         |  66 +++-
 tests/virmocklibxl.c                               |  13 +
 .../test-fullvirt-vnuma-autocomplete.cfg           |  26 ++
 .../test-fullvirt-vnuma-autocomplete.xml           |  85 ++++++
 .../test-fullvirt-vnuma-nodistances.cfg            |  26 ++
 .../test-fullvirt-vnuma-nodistances.xml            |  53 ++++
 .../test-fullvirt-vnuma-partialdist.cfg            |  26 ++
 .../test-fullvirt-vnuma-partialdist.xml            |  60 ++++
 tests/xlconfigdata/test-fullvirt-vnuma.cfg         |  26 ++
 tests/xlconfigdata/test-fullvirt-vnuma.xml         |  81 +++++
 tests/xlconfigtest.c                               |   6 +
 21 files changed, 1461 insertions(+), 9 deletions(-)
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.xml

-- 
2.13.6

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list


