The nodeinfo structure includes

    nodes   : the number of NUMA cells, 1 for uniform memory access
    sockets : number of CPU sockets per node
    cores   : number of cores per socket
    threads : number of threads per core

which does not work well for NUMA topologies where a node does not
consist of an integral number of CPU sockets.

We also have the VIR_NODEINFO_MAXCPUS macro in the public libvirt.h,
which computes the maximum number of CPUs as
(nodes * sockets * cores * threads).

As a result, we can't just change sockets to report the total number of
sockets instead of sockets per node. This would probably be the easiest
fix, since I doubt anyone is using the field directly. But because of
the macro, some apps might be using sockets indirectly.

This patch leaves sockets as the number of CPU sockets per node (and
fixes the qemu driver to comply with this) on machines where sockets can
be divided by nodes. If we can't divide sockets by nodes, we behave as
if there was just one NUMA node containing all sockets. Apps interested
in NUMA should consult the capabilities XML, which is what they probably
do anyway.

This way, the only case in which apps that care about NUMA may break is
on machines with a funky NUMA topology. And there is a chance libvirt
wasn't able to start any guests on those machines anyway (although that
depends on the topology, the total number of CPUs and the kernel
version). Nothing changes at all for apps that don't care about NUMA.

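For illustration only, the divisibility rule described above can be
sketched in isolation. This is not code from the patch: struct topo,
topo_maxcpus() and topo_normalize() are hypothetical names, the struct
only mirrors the topology fields of virNodeInfo, and the guard against
nodes == 0 is an assumption added for the sketch.

```c
/* Hypothetical stand-in for the topology fields of virNodeInfo. */
struct topo {
    unsigned int cpus;
    unsigned int nodes;
    unsigned int sockets;  /* total sockets on input, normalized on output */
    unsigned int cores;
    unsigned int threads;
};

/* Same product that VIR_NODEINFO_MAXCPUS computes, per the text above. */
static unsigned int topo_maxcpus(const struct topo *t)
{
    return t->nodes * t->sockets * t->cores * t->threads;
}

/* Keep sockets as "sockets per node" when the total divides evenly by
 * the number of NUMA nodes; otherwise pretend there is a single node
 * that holds all sockets. */
static void topo_normalize(struct topo *t)
{
    if (t->nodes == 0)      /* guard added for this sketch only */
        t->nodes = 1;

    if (t->sockets % t->nodes == 0)
        t->sockets /= t->nodes;
    else
        t->nodes = 1;
}
```

With the 48-CPU test host from the notes (4 sockets, 12 cores each,
8 NUMA nodes), 4 is not divisible by 8, so the sketch yields nodes = 1
and sockets = 4. With 2 sockets and 2 nodes it yields nodes = 2 and
sockets = 1. In both cases topo_maxcpus() still equals the real number
of CPUs.
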
Notes:
    * Testing on 4 sockets, 12 cores each, 8 NUMA nodes
      Xen (RHEL-5) hypervisor with numa=on:
        - xm info
            nr_cpus                : 48
            nr_nodes               : 8
            sockets_per_node       : 0
            cores_per_socket       : 12
            threads_per_core       : 1
        - virsh nodeinfo
            CPU(s):              48
            CPU socket(s):       4
            Core(s) per socket:  12
            Thread(s) per core:  1
            NUMA cell(s):        1
        - virsh capabilities
            /capabilities/host/topology/cells@num = 8
      QEMU driver:
        - virsh nodeinfo
            CPU(s):              48
            CPU socket(s):       4
            Core(s) per socket:  12
            Thread(s) per core:  1
            NUMA cell(s):        1
        - virsh capabilities
            /capabilities/host/topology/cells@num = 8

    * 2 sockets, 4 cores each, 2 NUMA nodes
      Xen (RHEL-5) hypervisor with numa=on:
        - xm info
            nr_cpus                : 8
            nr_nodes               : 2
            sockets_per_node       : 1
            cores_per_socket       : 4
            threads_per_core       : 1
        - virsh nodeinfo
            CPU(s):              8
            CPU socket(s):       1
            Core(s) per socket:  4
            Thread(s) per core:  1
            NUMA cell(s):        2
        - virsh capabilities
            /capabilities/host/topology/cells@num = 2
      QEMU driver:
        - virsh nodeinfo
            CPU(s):              8
            CPU socket(s):       1
            Core(s) per socket:  4
            Thread(s) per core:  1
            NUMA cell(s):        2
        - virsh capabilities
            /capabilities/host/topology/cells@num = 2

    * uniform memory architecture, 2 sockets, 4 cores each
      Xen (RHEL-5) hypervisor:
        - xm info
            nr_cpus                : 8
            nr_nodes               : 1
            sockets_per_node       : 2
            cores_per_socket       : 4
            threads_per_core       : 1
        - virsh nodeinfo
            CPU(s):              8
            CPU socket(s):       2
            Core(s) per socket:  4
            Thread(s) per core:  1
            NUMA cell(s):        1
        - virsh capabilities
            /capabilities/host/topology/cells@num = 1
      Xen (upstream) hypervisor:
        - xm info
            nr_cpus                : 8
            nr_nodes               : 1
            cores_per_socket       : 4
            threads_per_core       : 1
        - virsh nodeinfo
            CPU(s):              8
            CPU socket(s):       2
            Core(s) per socket:  4
            Thread(s) per core:  1
            NUMA cell(s):        1
        - virsh capabilities
            /capabilities/host/topology/cells@num = 1
      QEMU driver:
        - virsh nodeinfo
            CPU(s):              8
            CPU socket(s):       2
            Core(s) per socket:  4
            Thread(s) per core:  1
            NUMA cell(s):        1
        - virsh capabilities
            /capabilities/host/topology/cells@num = 1
---
 include/libvirt/libvirt.h.in |    9 ++++++---
 src/nodeinfo.c               |   10 ++++++++++
 src/xen/xend_internal.c      |   19 ++++++++++++++-----
 3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index 716f7af..395a9f8 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -219,9 +219,12 @@ struct _virNodeInfo {
     unsigned long memory;/* memory size in kilobytes */
     unsigned int cpus;  /* the number of active CPUs */
     unsigned int mhz;   /* expected CPU frequency */
-    unsigned int nodes; /* the number of NUMA cell, 1 for uniform mem access */
-    unsigned int sockets;/* number of CPU socket per node */
-    unsigned int cores; /* number of core per socket */
+    unsigned int nodes; /* the number of NUMA cell, 1 for unusual NUMA
+                           topologies or uniform memory access; check
+                           capabilities XML for the actual NUMA topology */
+    unsigned int sockets;/* number of CPU sockets per node if nodes > 1,
+                            total number of CPU sockets otherwise */
+    unsigned int cores; /* number of cores per socket */
     unsigned int threads;/* number of threads per core */
 };
 
diff --git a/src/nodeinfo.c b/src/nodeinfo.c
index 9be2a02..acd3188 100644
--- a/src/nodeinfo.c
+++ b/src/nodeinfo.c
@@ -305,6 +305,16 @@ int linuxNodeInfoCPUPopulate(FILE *cpuinfo,
         return -1;
     }
 
+    /* nodeinfo->sockets is supposed to be a number of sockets per NUMA node,
+     * however if NUMA nodes are not composed of whole sockets, we just lie
+     * about the number of NUMA nodes and force apps to check capabilities XML
+     * for the actual NUMA topology.
+     */
+    if (nodeinfo->sockets % nodeinfo->nodes == 0)
+        nodeinfo->sockets /= nodeinfo->nodes;
+    else
+        nodeinfo->nodes = 1;
+
     return 0;
 }
 
diff --git a/src/xen/xend_internal.c b/src/xen/xend_internal.c
index 4450195..6ce0c3f 100644
--- a/src/xen/xend_internal.c
+++ b/src/xen/xend_internal.c
@@ -2497,12 +2497,21 @@ sexpr_to_xend_node_info(const struct sexpr *root, virNodeInfoPtr info)
         if (procs == 0) /* Sanity check in case of Xen bugs in futures..*/
             return (-1);
         info->sockets = nr_cpus / procs;
-        /* Should already be fine, but for further sanity make
-         * sure we have at least one socket
-         */
-        if (info->sockets == 0)
-            info->sockets = 1;
     }
+
+    /* On systems where NUMA nodes are not composed of whole sockets either Xen
+     * provided us wrong number of sockets per node or we computed the wrong
+     * number in the compatibility code above. In such case, we compute the
+     * correct number of sockets on the host, lie about the number of NUMA
+     * nodes, and force apps to check capabilities XML for the actual NUMA
+     * topology.
+     */
+    if (info->nodes * info->sockets * info->cores * info->threads
+        != info->cpus) {
+        info->nodes = 1;
+        info->sockets = info->cpus / (info->cores * info->threads);
+    }
+
     return (0);
 }
 
-- 
1.7.3.2

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list