On Tue, Feb 10, 2015 at 1:14 AM, Michal Privoznik <mprivozn@xxxxxxxxxx> wrote:
> On 09.02.2015 18:19, G. Richard Bellamy wrote:
>> First I'll quickly summarize my understanding of how to configure NUMA...
>>
>> In "//memoryBacking/hugepages/page[@nodeset]" I am telling libvirt to
>> use hugepages for the guest, and to get those hugepages from a
>> particular host NUMA node.
>
> No, @nodeset refers to guest NUMA nodes.
>
>> In "//numatune/memory[@nodeset]" I am telling libvirt to pin the
>> guest's memory allocation to a particular host NUMA node.
>
> The <memory/> element says what to do with guest NUMA nodes that are
> not explicitly pinned.
>
>> In "//numatune/memnode[@nodeset]" I am telling libvirt which guest
>> NUMA node (cellid) should come from which host NUMA node (nodeset).
>
> Correct. This way you can explicitly pin guest NUMA nodes onto host
> NUMA nodes.
>
>> In "//cpu/numa/cell[@id]" I am telling libvirt how much memory to
>> allocate to each guest NUMA node (cell).
>
> Yes. Each <cell/> creates a guest NUMA node. It interconnects vCPUs
> and guest memory: which vCPUs should lie in which guest NUMA node, and
> how much memory should be available to that particular guest NUMA node.
>
>> Basically, I thought "nodeset", regardless of where it appeared in the
>> domain XML, referred to the host's NUMA nodes, and "cell" (<cell id=/>
>> or @cellid) referred to the guest's NUMA nodes.
>>
>> However....
>>
>> Atlas [1] starts without issue, but prometheus [2] fails with
>> "libvirtd[]: hugepages: node 2 not found". I found a patch that
>> contains the code responsible for throwing this error [3]:
>>
>> +    if (def->cpu && def->cpu->ncells) {
>> +        /* Fortunately, we allow only guest NUMA nodes to be continuous
>> +         * starting from zero. */
>> +        pos = def->cpu->ncells - 1;
>> +    }
>> +
>> +    next_bit = virBitmapNextSetBit(page->nodemask, pos);
>> +    if (next_bit >= 0) {
>> +        virReportError(VIR_ERR_XML_DETAIL,
>> +                       _("hugepages: node %zd not found"),
>> +                       next_bit);
>> +        return -1;
>> +    }
>>
>> Without digging too deeply into the actual code, and just inferring
>> from the above, it looks like it reads the number of cells set in
>> "//cpu/numa" from def->cpu->ncells and compares it against the
>> nodesets in "//memoryBacking/hugepages". I think this means I
>> misunderstand what the nodeset is for in that element...
>>
>> Of note is the fact that my host has non-contiguous NUMA node numbers:
>>
>> 2015-02-09 08:53:06
>> root@eanna i ~ # numastat
>>                       node0       node2
>> numa_hit          216225024   440311113
>> numa_miss                 0      795018
>> numa_foreign         795018           0
>> interleave_hit        15835       15783
>> local_node        214029815   221903122
>> other_node          2195209   219203009
>>
>> Thanks again for any help.
>>
>
> Libvirt should be perfectly able to cope with noncontiguous host NUMA
> nodes. However, noncontiguous guest NUMA nodes are not supported yet -
> but that shouldn't matter, since users have full control over creating
> guest NUMA nodes.
>
> Anyway, if you find the documentation incomplete in any sense or any
> part, or you feel that rewording some paragraphs may help, feel free
> to propose a patch and I'll review it.

Thanks again Michal, I'm slowly zeroing in on a good resolution here. I
think the documentation is clear enough; what threw me is that a guest
NUMA node can be referred to as either a cell (id) or a nodeset,
depending on the element it appears in. I've modified my config [1]
based on that understanding, and am now running into a new error.
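To spell out how I now read the cell-vs-nodeset distinction, here is a
minimal sketch (made-up sizes and CPU ranges, not my actual config,
which is at [1]) of a two-cell guest where guest cell 0 is pinned to
host node 0 and guest cell 1 to host node 2:

    <memoryBacking>
      <hugepages>
        <!-- nodeset here names *guest* NUMA nodes -->
        <page size='2048' unit='KiB' nodeset='0-1'/>
      </hugepages>
    </memoryBacking>
    <numatune>
      <!-- fallback for any guest node not pinned via <memnode/>; this
           nodeset names *host* NUMA nodes -->
      <memory mode='strict' nodeset='0,2'/>
      <!-- guest cell 0 -> host node 0, guest cell 1 -> host node 2 -->
      <memnode cellid='0' mode='strict' nodeset='0'/>
      <memnode cellid='1' mode='strict' nodeset='2'/>
    </numatune>
    <cpu>
      <numa>
        <!-- each <cell/> is one guest NUMA node; memory is in KiB -->
        <cell id='0' cpus='0-3' memory='4194304'/>
        <cell id='1' cpus='4-7' memory='4194304'/>
      </numa>
    </cpu>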
Basically, I'm hitting the oom-killer [2] even though the <memtune>
hard_limit [3] is below the total number of hugepages set aside for
that NUMA nodeset.

[1] http://sprunge.us/BadI
[2] http://sprunge.us/eELZ
[3] http://sprunge.us/GYXM
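For anyone skimming, the stanza in question has this shape (the value
here is a placeholder, not the one from my actual config in [3]):

    <memtune>
      <!-- cgroup cap on the whole QEMU process; unit defaults to KiB -->
      <hard_limit unit='KiB'>9437184</hard_limit>
    </memtune>

My working theory is that hard_limit has to cover QEMU's own overhead
(emulator threads, video RAM, I/O buffers) on top of the guest RAM, so
a limit sized only to the hugepage total could still trip the
oom-killer - but I'd appreciate confirmation.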