On Wed, Jun 13, 2007 at 10:40:40AM -0500, Ryan Harper wrote:
> Hello all,

  Hello Ryan,

> I wanted to start a discussion on how we might get libvirt to be able to
> probe the NUMA topology of Xen and Linux (for QEMU/KVM). In Xen, I've
> recently posted patches for exporting topology into the [1]physinfo
> hypercall, as well adding a [2]hypercall to probe the Xen heap. I
> believe the topology and memory info is already available in Linux.
> With these, we have enough information to be able to write some simple
> policy above libvirt that can create guests in a NUMA-aware fashion.
>
> I'd like to suggest the following for discussion:
>
> (1) A function to discover topology
> (2) A function to check available memory
> (3) Specifying which cpus to use prior to domain start

  Okay, but let's start by defining the scope a bit. Historically, NUMA
systems have explored various paths, and I assume we are going to work in
a rather small subset of what NUMA (Non-Uniform Memory Access) has meant
over time. I assume the following, tell me if I'm wrong:

  - we are just considering memory and processor affinity
  - the topology, i.e. the affinity between the processors and the
    various memory areas, is fixed, and the kind of mapping is rather
    simple

  To get into more specifics:

  - we will need to expand the model of libvirt
      http://libvirt.org/intro.html
    to split the Node resources into separate sets containing processors
    and memory areas which are highly connected together (assuming the
    model is a simple partition of the resources between the equivalent
    of sub-Nodes)
  - the function (2) would, for a given processor, tell how much of its
    memory is already allocated (to existing running or paused domains)

  Right? Is the partition model sufficient for the architectures? If yes,
then we will need a new definition and terminology for those sub-Nodes.
(A rough sketch of what (1) and (2) could look like at the C API level is
appended after the signature.)

  For (3) we already have support for pinning the domain virtual CPUs to
physical CPUs, but I guess it's not sufficient, because you want this to
be activated from the definition of the domain:
    http://libvirt.org/html/libvirt-libvirt.html#virDomainPinVcpu
So the XML format would have to be extended to allow specifying the
subset of processors the domain is supposed to start on:
    http://libvirt.org/format.html
(A usage sketch for virDomainPinVcpu is also appended below.)
  I would assume that if nothing is specified, the underlying Hypervisor
(in libvirt terminology; in practice that could be a Linux kernel) will
by default try to do the optimal placement by itself, i.e. (3) is only
useful if you want to override the default behaviour.

  Please correct me if I'm wrong,

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@xxxxxxxxxx  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
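
Purely for illustration, a minimal sketch of what entry points for (1)
topology discovery and (2) available memory could look like at the C API
level. None of this exists in libvirt today: the names
virNodeGetCellsCount, virNodeGetCellCpus and virNodeGetCellFreeMemory,
and the use of "cell" for a sub-Node, are hypothetical placeholders for
the discussion only.

    /*
     * Purely hypothetical sketch -- none of these entry points are part
     * of the current libvirt API; the names, signatures and "cell"
     * terminology are placeholders for the discussion only.
     */
    #include <libvirt/libvirt.h>

    /* (1) topology discovery: how many cells (sub-Nodes) the Node is
     * split into, and which physical CPUs belong to a given cell. */
    int virNodeGetCellsCount (virConnectPtr conn);
    int virNodeGetCellCpus   (virConnectPtr conn,
                              int cell,      /* cell number, 0-based         */
                              int *cpus,     /* filled with physical CPU ids */
                              int maxcpus);  /* size of the cpus array       */

    /* (2) available memory: how much memory in a given cell is not yet
     * allocated to running or paused domains, in kilobytes. */
    int virNodeGetCellFreeMemory (virConnectPtr conn,
                                  int cell,
                                  unsigned long long *freekb);

A simple NUMA-aware policy layered above libvirt could then pick the
cell with enough free memory and restrict the new domain's CPUs to that
cell's processors.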
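
As for (3), until the XML format can express the constraint, a caller
can already pin virtual CPUs with the existing virDomainPinVcpu call
once the domain exists. A minimal sketch, assuming a Xen connection and
a domain named "guest1" (both just examples), that pins every vCPU onto
physical CPUs 0-3:

    /*
     * Minimal sketch using the existing API: pin every virtual CPU of a
     * running domain onto physical CPUs 0-3.  The connection URI and
     * the domain name are examples only.
     */
    #include <stdio.h>
    #include <libvirt/libvirt.h>

    int main(void) {
        virConnectPtr conn = virConnectOpen("xen:///"); /* NULL picks the default */
        if (conn == NULL) {
            fprintf(stderr, "failed to connect to the hypervisor\n");
            return 1;
        }

        virDomainPtr dom = virDomainLookupByName(conn, "guest1");
        if (dom == NULL) {
            fprintf(stderr, "domain not found\n");
            virConnectClose(conn);
            return 1;
        }

        virDomainInfo info;
        if (virDomainGetInfo(dom, &info) < 0) { /* info.nrVirtCpu = vCPU count */
            fprintf(stderr, "failed to get domain info\n");
            virDomainFree(dom);
            virConnectClose(conn);
            return 1;
        }

        /* The CPU map is a bitmask with one bit per physical CPU;
         * CPUs 0-3 fit in a single byte. */
        int maplen = 1;
        unsigned char cpumap[1] = { 0 };
        VIR_USE_CPU(cpumap, 0);
        VIR_USE_CPU(cpumap, 1);
        VIR_USE_CPU(cpumap, 2);
        VIR_USE_CPU(cpumap, 3);

        for (unsigned int vcpu = 0; vcpu < info.nrVirtCpu; vcpu++) {
            if (virDomainPinVcpu(dom, vcpu, cpumap, maplen) < 0)
                fprintf(stderr, "failed to pin vcpu %u\n", vcpu);
        }

        virDomainFree(dom);
        virConnectClose(conn);
        return 0;
    }

Because the pinning only happens after the domain has been created, its
memory may already have been allocated from the wrong place, which is
exactly why being able to express the constraint in the domain's XML
definition, before it starts, matters.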