From: Wim ten Have <wim.ten.have@xxxxxxxxxx>

This patch extends guest domain administration by adding support to
advertise node sibling distances when configuring NUMA guests, also
referred to as vNUMA (Virtual NUMA).

NUMA (Non-Uniform Memory Access) is a method of configuring a cluster
of nodes within a single multiprocessing system such that it shares
processor-local memory amongst others, improving performance and the
ability of the system to be expanded.

A NUMA system could be illustrated as shown below. Within this 4-NODE
system, every socket is equipped with its own distinct memory and some
with I/O. Access to memory or I/O on remote nodes is only possible
through the "Interconnect". This results in different performance for
local and remote resources.

In contrast to NUMA we recognize the flat SMP system where no concept
of local or remote resource exists. The disadvantage of high socket
count SMP systems is that the shared bus can easily become a
performance bottleneck under high activity.

        +-------------+-------+        +-------+-------------+
        |NODE0|       |       |        |       |       |NODE3|
        |     | CPU00 | CPU03 |        | CPU12 | CPU15 |     |
        |     |       |       |        |       |       |     |
        | Mem +--- Socket0 ---<-------->--- Socket3 ---+ Mem |
        |     |       |       |        |       |       |     |
        +-----+ CPU01 | CPU02 |        | CPU13 | CPU14 |     |
        | I/O |       |       |        |       |       |     |
        +-----+-------^-------+        +-------^-------+-----+
                      |                        |
                      |      Interconnect      |
                      |                        |
        +-------------v-------+        +-------v-------------+
        |NODE1|       |       |        |       |       |NODE2|
        |     | CPU04 | CPU07 |        | CPU08 | CPU11 |     |
        |     |       |       |        |       |       |     |
        | Mem +--- Socket1 ---<-------->--- Socket2 ---+ Mem |
        |     |       |       |        |       |       |     |
        +-----+ CPU05 | CPU06 |        | CPU09 | CPU10 |     |
        | I/O |       |       |        |       |       |     |
        +-----+-------+-------+        +-------+-------+-----+

NUMA adds an intermediate level of memory shared amongst a few cores
per socket as illustrated above, so that data accesses do not have to
travel over a single bus.

Unfortunately the way NUMA does this adds its own limitations.
This, as visualized in the illustration above, happens when data is
stored in memory associated with Socket2 and is accessed by a CPU
(core) in Socket0. The processors use the "Interconnect" path to
access resources on other nodes. These "Interconnect" hops add data
access delays. It is therefore in our interest to describe the
relative distances between nodes.

The relative distances between nodes are described in the system's
SLIT (System Locality Distance Information Table) which is part of the
ACPI (Advanced Configuration and Power Interface) specification.

On Linux systems the SLIT detail can be listed with help of the
'numactl -H' command. The above guest would show the following output.

  [root@f25 ~]# numactl -H
  available: 4 nodes (0-3)
  node 0 cpus: 0 1 2 3
  node 0 size: 2007 MB
  node 0 free: 1931 MB
  node 1 cpus: 4 5 6 7
  node 1 size: 1951 MB
  node 1 free: 1902 MB
  node 2 cpus: 8 9 10 11
  node 2 size: 1998 MB
  node 2 free: 1910 MB
  node 3 cpus: 12 13 14 15
  node 3 size: 2015 MB
  node 3 free: 1907 MB
  node distances:
  node   0   1   2   3
    0:  10  21  31  21
    1:  21  10  21  31
    2:  31  21  10  21
    3:  21  31  21  10

These patches extend core libvirt's XML description of NUMA cells to
include NUMA distance information and propagate it to Xen guests via
libxl. qemu recently gained support for constructing the SLIT, as of
commit 0f203430dd ("numa: Allow setting NUMA distance for different
NUMA nodes"). The core libvirt extensions in this patch set could be
used to propagate NUMA distances to qemu guests in the future.
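
To make the distance table above more concrete, here is a minimal
Python sketch (not part of this series) that parses the "node
distances:" section of the numactl output, checks that each node's
local distance is 10 as the ACPI SLIT convention prescribes, and prints
one possible XML rendering. The helper names are hypothetical. The
<distances>/<sibling> element names are an assumption taken from
libvirt's domain XML documentation rather than from this cover letter,
and the cells omit other attributes (cpus, memory) that a real domain
definition would need. The symmetry check reflects this example only.

```python
import xml.etree.ElementTree as ET

# Distance table copied from the 'numactl -H' output shown above.
NUMACTL_DISTANCES = """\
node distances:
node   0   1   2   3
  0:  10  21  31  21
  1:  21  10  21  31
  2:  31  21  10  21
  3:  21  31  21  10
"""


def parse_distances(text):
    """Return the distance matrix found after the 'node distances:' line."""
    matrix = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if line.strip().startswith("node distances:"):
            in_table = True
        elif in_table and fields and fields[0].endswith(":"):
            # Rows look like "  0:  10  21  31  21".
            matrix.append([int(value) for value in fields[1:]])
    return matrix


def check_matrix(matrix):
    """Raise ValueError unless the matrix is square, has 10 on the
    diagonal (local distance), and is symmetric (true for this example)."""
    size = len(matrix)
    for i, row in enumerate(matrix):
        if len(row) != size:
            raise ValueError("row %d has %d entries, expected %d"
                             % (i, len(row), size))
        if row[i] != 10:
            raise ValueError("local distance of node %d is not 10" % i)
        for j, value in enumerate(row):
            if value != matrix[j][i]:
                raise ValueError("distance %d->%d is asymmetric" % (i, j))


def cells_xml(matrix):
    """Render the matrix as <cell>/<distances>/<sibling> elements.
    Element and attribute names are an assumption, see the note above."""
    numa = ET.Element("numa")
    for i, row in enumerate(matrix):
        cell = ET.SubElement(numa, "cell", id=str(i))
        distances = ET.SubElement(cell, "distances")
        for j, value in enumerate(row):
            ET.SubElement(distances, "sibling", id=str(j), value=str(value))
    return ET.tostring(numa, encoding="unicode")


if __name__ == "__main__":
    table = parse_distances(NUMACTL_DISTANCES)
    check_matrix(table)
    print(cells_xml(table))
```

Feeding the full 'numactl -H' output to parse_distances() works as
well, since lines before "node distances:" are ignored.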

Wim ten Have (4):
  numa: describe siblings distances within cells
  xenconfig: add domxml conversions for xen-xl
  libxl: vnuma support
  xlconfigtest: add tests for numa cell sibling distances

 docs/formatdomain.html.in                   |  63 +++-
 docs/schemas/basictypes.rng                 |   7 +
 docs/schemas/cputypes.rng                   |  18 ++
 src/conf/numa_conf.c                        | 328 +++++++++++++++++++-
 src/conf/numa_conf.h                        |  25 ++
 src/libvirt_private.syms                    |   5 +
 src/libxl/libxl_conf.c                      | 119 ++++++++
 src/libxl/libxl_domain.c                    |   7 +-
 src/xenconfig/xen_xl.c                      | 335 +++++++++++++++++++++
 tests/libxlxml2domconfigdata/basic-hvm.json |  95 +++++-
 tests/libxlxml2domconfigdata/basic-hvm.xml  |  66 +++-
 tests/virmocklibxl.c                        |  13 +
 .../test-fullvirt-vnuma-autocomplete.cfg    |  26 ++
 .../test-fullvirt-vnuma-autocomplete.xml    |  85 ++++++
 .../test-fullvirt-vnuma-nodistances.cfg     |  26 ++
 .../test-fullvirt-vnuma-nodistances.xml     |  53 ++++
 .../test-fullvirt-vnuma-partialdist.cfg     |  26 ++
 .../test-fullvirt-vnuma-partialdist.xml     |  60 ++++
 tests/xlconfigdata/test-fullvirt-vnuma.cfg  |  26 ++
 tests/xlconfigdata/test-fullvirt-vnuma.xml  |  81 +++++
 tests/xlconfigtest.c                        |   6 +
 21 files changed, 1461 insertions(+), 9 deletions(-)
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.xml

--
2.13.6

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list