Re: Extending libvirt to probe NUMA topology

Daniel Veillard <veillard@xxxxxxxxxx> · Thu, 6 Sep 2007 11:17:57 -0400

On Thu, Sep 06, 2007 at 03:40:23PM +0100, Richard W.M. Jones wrote:
> Daniel Veillard wrote:
> >1) Provide a function describing the topology as an XML instance:
> >
> >   char *virNodeGetTopology(virConnectPtr conn);
> 
> >which would return an XML instance as in virConnectGetCapabilities. I
> >toyed with the idea of extending virConnectGetCapabilities() to add a
> >topology section in case of NUMA support at the hypervisor level, but
> >it was looking to me that the two might be used at different times
> >and separating both might be a bit cleaner, but I could be convinced
> >otherwise.
> 
> I'd definitely prefer to extend virConnectGetCapabilities XML.  It 
> avoids changing the remote driver and language bindings, and really 
> callers only need to pull capabilities once per connection.

  Yeah, I understand that concern: it simplifies a lot of the internals.
But the goal at the library level is to simplify the user code, even if
that means a more complex implementation. However, if people think they
don't need a separate call, then I'm really fine with this.

> >---------------------------------
> ><topology>
> >  <cells num='2'>
> >    <cell id='0'>
> >      <cpus num='2'>
> >        <cpu id='0'/>
> >        <cpu id='1'/>
> >      </cpus>
> >      <memory size='2097152'/>
> >    </cell>
> >    <cell id='1'>
> >      <cpus num='2'>
> >        <cpu id='2'/>
> >        <cpu id='3'/>
> >      </cpus>
> >      <memory size='2097152'/>
> >    </cell>
> >  </cells>
> ></topology>
> >---------------------------------
> >
> >  A few things to note:
> >   - the <cells> element lists the top sibling cells
> 
> Not <nodes>?

  A Node in libvirt terminology is a single physical machine; cell is
a well accepted term, I think, for a sub-node within a NUMA box.

> >   - the <cell> element describes as child the resources available
> >     like the list of CPUs, the size of the local memory, that could
> >     be extended by disk descriptions too
> >     <disk dev='/dev/sdb'/>
> >     and possibly other special devices (no idea what ATM).
> >
> >   - in case of deeper hierarchical topology one may need to be able to
> >     name sub-cells and the format could be extended for example as
> >     <cells num='2'>
> >       <cells num='2'>
> >         <cell id='1'>
> >           ...
> >         </cell>
> >         <cell id='2'>
> >           ...
> >         </cell>
> >       </cells>
> >       <cells num='2'>
> >         <cell id='3'>
> >           ...
> >         </cell>
> >         <cell id='4'>
> >           ...
> >         </cell>
> >       </cells>
> >     </cells>
> >     But that can be discussed/changed when the need arises :-)
> 
> Especially note that 4 (or more) socket AMDs have a topology like this, 
> with two different penalties for reaching nodes which are one and two 
> hops away.  Do we have a way to describe the penalties along different 
> paths?

  As hinted in my mail, I think the access costs will have to be added
separately, probably as an array map, unless people come up with a more
intelligent way of exposing that information.
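
  One possible shape for such an array map, purely as a sketch: a square
matrix indexed by cell id, where entry [from][to] is the cost of reaching
memory in cell "to" from a CPU in cell "from". The hop counts below are made
up for a four-cell box wired as a ring (0-1-3-2-0), and the names NCELLS,
cell_cost, cell_access_cost and cheapest_remote_cell are hypothetical, not
part of any libvirt API.

```c
#include <assert.h>

#define NCELLS 4   /* hypothetical: four cells wired as a ring 0-1-3-2-0 */

/* Illustrative hop counts only, not measured penalties:
 * 0 = local, 1 = adjacent cell, 2 = cell on the far side of the ring. */
static const int cell_cost[NCELLS][NCELLS] = {
    /* to: 0  1  2  3 */
    {      0, 1, 1, 2 },   /* from cell 0 */
    {      1, 0, 2, 1 },   /* from cell 1 */
    {      1, 2, 0, 1 },   /* from cell 2 */
    {      2, 1, 1, 0 },   /* from cell 3 */
};

/* Return the access cost between two cells, or -1 if either id is
 * out of range. */
int cell_access_cost(int from, int to)
{
    if (from < 0 || from >= NCELLS || to < 0 || to >= NCELLS)
        return -1;
    return cell_cost[from][to];
}

/* Return the lowest-numbered cell other than `from` with the smallest
 * access cost, i.e. the first fallback when `from` is full. */
int cheapest_remote_cell(int from)
{
    int best = -1;
    int to;

    assert(from >= 0 && from < NCELLS);   /* caller error otherwise */

    for (to = 0; to < NCELLS; to++) {
        if (to == from)
            continue;
        if (best < 0 || cell_cost[from][to] < cell_cost[from][best])
            best = to;
    }
    return best;
}
```

  With a map like this the one-hop and two-hop cases are just different
values in the same row, so a deeper hierarchy would not need a new format.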

> >2) Function to get the free memory of a given cell:
> >
> >   unsigned long virNodeGetCellFreeMemory(virConnectPtr conn, int cell);
> >
> >that's relatively simple, would match the request from the initial mail
> >but I'm wondering a bit. If the program tries to do a best placement it
> >will usually run that request for a number of cells no ? Maybe a call
> >returning the memory amounts for a range of cells would be more 
> >appropriate.
> 
> Yes, I guess they'd want to get the free memory for all nodes.  But IBM 
> will have a better idea about this.

 Well I'm looking for feedback :-)
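
 To make the range idea concrete, here is a rough sketch of what a call
returning the free memory of several cells at once could look like.
Everything in it is an assumption for discussion: the name
sketch_get_cells_free_memory, its parameters, the kilobyte unit and the
stubbed per-cell values. A real version would take the virConnectPtr and
ask the hypervisor instead of reading a static table.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NCELLS 2   /* hypothetical two-cell box */

/* Stub data standing in for the hypervisor's answer, in KB (made up). */
static const unsigned long long stub_free_kb[SKETCH_NCELLS] = {
    524288ULL, 1048576ULL
};

/* Fill free_kb[0..] with the free memory of cells start_cell,
 * start_cell + 1, ... up to max_cells entries.
 * Returns the number of entries filled, or -1 on bad arguments. */
int sketch_get_cells_free_memory(unsigned long long *free_kb,
                                 int start_cell, int max_cells)
{
    int n;

    if (free_kb == NULL || start_cell < 0 ||
        start_cell >= SKETCH_NCELLS || max_cells <= 0)
        return -1;

    for (n = 0; n < max_cells && start_cell + n < SKETCH_NCELLS; n++)
        free_kb[n] = stub_free_kb[start_cell + n];
    return n;
}

/* Naive placement helper: return the id of the cell with the most free
 * memory, using a single range query, or -1 if the query fails. */
int sketch_pick_cell(void)
{
    unsigned long long free_kb[SKETCH_NCELLS];
    int n = sketch_get_cells_free_memory(free_kb, 0, SKETCH_NCELLS);
    int best = 0;
    int i;

    if (n <= 0)
        return -1;
    assert(n <= SKETCH_NCELLS);   /* callee must not overrun the buffer */

    for (i = 1; i < n; i++)
        if (free_kb[i] > free_kb[best])
            best = i;
    return best;
}
```

 The point of the range form is that a placement loop makes one call
instead of one per cell; the single-cell query is then just the case
max_cells == 1.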

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@xxxxxxxxxx  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
