Currently we are only able to bind the whole domain to some host nodes
using the /domain/numatune/memory element.  Numerous requests were made
to support host<->guest NUMA node bindings, so this series tries to
pitch an idea on how to do that using /domain/numatune/memnode
elements.

Here are a few ideas I'd like to hear others' opinions on:

For some reason, qemu wants to know which host nodes it can use for
allocation of the memory.  While adding support for that, qemu added
various memory objects (-object memory*) with different backends.
There's 'memory-file', which is used for hugepages, and 'memory-ram',
which is used for standard allocation.  The latest version of the qemu
proposal is here:

  http://lists.gnu.org/archive/html/qemu-devel/2014-05/msg02706.html

Caveats:

 - I'm not sure how cpu hotplug is done with guest NUMA nodes, but if
   there is a possibility to increase the number of NUMA nodes (which
   does not make sense to me from (a) the user's point of view and
   (b) our XMLs and APIs), we need to be able to hotplug the RAM as
   well,

 - virDomainGetNumaParameters() now reflects only the
   /domain/numatune/memory settings, not the 'memnode' ones,

 - virDomainSetNumaParameters() is not allowed when there is some
   /domain/numatune/memnode parameter, as we can query memdev info but
   not change it (if I understood the QEMU side correctly),

 - when the domain is started, the cpuset.mems cgroup is not modified
   for each vcpu; this will be fixed, but the question is how to handle
   it for non-strict settings [*],

 - automatic numad placement can now be used together with memnode
   settings, which IMHO doesn't make any sense, but I was hesitant to
   disable that in case somebody has constructive criticism in this
   area,

 - this series alone is broken when used with
   /domain/memoryBacking/hugepages, because it will still use the
   memory-ram object, but that will be fixed with Michal's patches on
   top of this series.
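To make the non-strict question from the cpuset.mems caveat (the [*]
footnote) a bit more concrete, here is a small standalone sketch of one
possible reading: the domain-wide strict nodeset is a hard bound, and a
per-node 'preferred' nodeset only picks which of those host nodes are
tried first.  This is purely illustrative and is not what the series or
QEMU implements; nodeset_t, alloc_class, nodeset_range() and
classify_host_node() are made-up names, and treating a preferred node
outside the domain-wide set as unusable is my assumption.

```c
#include <stdint.h>

/* Hypothetical representation: bit N set means host NUMA node N. */
typedef uint64_t nodeset_t;

/* How a host node may be used for one guest node's memory. */
typedef enum {
    ALLOC_NEVER,      /* outside the domain-wide strict nodeset */
    ALLOC_FALLBACK,   /* allowed, but only if preferred nodes are full */
    ALLOC_PREFERRED   /* tried first */
} alloc_class;

/* Build a nodeset covering host nodes first..last (inclusive). */
static nodeset_t nodeset_range(unsigned first, unsigned last)
{
    nodeset_t set = 0;

    for (unsigned n = first; n <= last && n < 64; n++)
        set |= (nodeset_t)1 << n;
    return set;
}

/*
 * Classify one host node for a guest node, assuming:
 *   - domain_strict comes from <memory mode='strict' nodeset=.../>
 *   - memnode_preferred comes from <memnode mode='preferred' nodeset=.../>
 * Assumption: the strict set always wins, so a preferred node that is
 * not in the strict set is never used.
 */
static alloc_class classify_host_node(nodeset_t domain_strict,
                                      nodeset_t memnode_preferred,
                                      unsigned host_node)
{
    nodeset_t bit;

    if (host_node >= 64)
        return ALLOC_NEVER;

    bit = (nodeset_t)1 << host_node;

    if (!(domain_strict & bit))
        return ALLOC_NEVER;
    if (memnode_preferred & bit)
        return ALLOC_PREFERRED;
    return ALLOC_FALLBACK;
}
```

With a strict domain nodeset of 0-7 and a preferred memnode nodeset of
7, this classifies host node 7 as preferred, nodes 0-6 as fallback and
nodes 8-15 as never used.  Whether that is the semantics we want is
exactly the open question.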
One idea for solving some of the problems is to say that
/domain/numatune/memory is set for the whole domain regardless of what
anyone puts in /domain/numatune/memnode.

virDomainGetNumaParameters() could be extended to report the info for
all guest NUMA nodes, although it seems a new API would suit this kind
of information better.  But is it really needed when we are not able to
modify it live and the information is available in the domain XML?

*) does (or should) this:

...
<numatune>
  <memory mode='strict' placement='static' nodeset='0-7'/>
  <memnode nodeid='0' mode='preferred' nodeset='7'/>
</numatune>
...

mean what it looks like it means, that is "in guest node 0, prefer
allocating from host node 7, but feel free to allocate from 0-6 as well
in case you can't use 7, but never try allocating from host nodes
8-15"?

Martin Kletzander (5):
  conf, schema: add 'id' field for cells
  conf, schema: add support for numatune memnode element
  qemu: purely a code movement
  qemu: numa capability probing
  qemu: pass numa node binding preferences to qemu

 docs/formatdomain.html.in                          |  29 +++-
 docs/schemas/domaincommon.rng                      |  22 +++
 src/conf/cpu_conf.c                                |  39 ++++-
 src/conf/domain_conf.c                             | 181 +++++++++++++++++----
 src/qemu/qemu_capabilities.c                       |   2 +
 src/qemu/qemu_capabilities.h                       |   1 +
 src/qemu/qemu_cgroup.c                             |   2 +
 src/qemu/qemu_command.c                            | 160 ++++++++++++++++--
 src/qemu/qemu_command.h                            |   3 +-
 src/qemu/qemu_domain.c                             |  23 ++-
 src/qemu/qemu_driver.c                             |  14 +-
 src/qemu/qemu_process.c                            |   3 +-
 src/util/virnuma.h                                 |  14 +-
 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa1.xml  |   6 +-
 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa2.xml  |   6 +-
 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa3.xml  |  25 +++
 .../qemuxml2argv-numatune-auto-prefer.args         |   6 +
 .../qemuxml2argv-numatune-auto-prefer.xml          |  29 ++++
 .../qemuxml2argv-numatune-auto.args                |   6 +
 .../qemuxml2argv-numatune-auto.xml                 |  26 +++
 .../qemuxml2argv-numatune-memnode-nocpu.xml        |  25 +++
 .../qemuxml2argv-numatune-memnodes-problematic.xml |  31 ++++
 .../qemuxml2argv-numatune-memnodes.args            |   8 +
 .../qemuxml2argv-numatune-memnodes.xml             |  31 ++++
 .../qemuxml2argv-numatune-prefer.args              |   6 +
 .../qemuxml2argv-numatune-prefer.xml               |  29 ++++
 tests/qemuxml2argvtest.c                           |  51 ++++--
 .../qemuxml2xmlout-cpu-numa1.xml                   |  28 ++++
 .../qemuxml2xmlout-cpu-numa2.xml                   |  28 ++++
 tests/qemuxml2xmltest.c                            |   4 +
 tests/qemuxmlnstest.c                              |   2 +-
 31 files changed, 747 insertions(+), 93 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-cpu-numa3.xml
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-auto-prefer.xml
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-auto.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-auto.xml
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-memnode-nocpu.xml
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-memnodes-problematic.xml
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-memnodes.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-memnodes.xml
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-prefer.args
 create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-numatune-prefer.xml
 create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-cpu-numa1.xml
 create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-cpu-numa2.xml

--
1.9.3