On Fri, Nov 07, 2014 at 05:36:43PM +0800, Wang Rui wrote:
> On 2014/11/5 16:07, Martin Kletzander wrote:
> [...]
>>> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
>>> index b5bdb36..8685d6f 100644
>>> --- a/src/qemu/qemu_cgroup.c
>>> +++ b/src/qemu/qemu_cgroup.c
>>> @@ -618,6 +618,11 @@ qemuSetupCpusetMems(virDomainObjPtr vm,
>>>      if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
>>>          return 0;
>>>
>>> +    if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
>>> +        VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
>>> +        return 0;
>>> +    }
>>> +
>>
>> One question: is it a problem only for 'preferred', or for 'interleave'
>> as well?  Because if it's a problem only for 'preferred', then the check
>> is wrong.  If it's a problem for 'interleave' as well, then the commit
>> message is wrong.
>
> 'interleave' with a single node (such as nodeset='0') causes the same
> error.  But 'interleave' mode should not be used with a single node, so
> maybe there should be another bugfix to reject 'interleave' with a
> single node.
>
>> Well, I'd be OK with just changing the commit message to mention that.
>> This fix is still a valid one and will fix both issues, won't it?
>
> If configured with 'interleave' and multiple nodes (such as
> nodeset='0-1'), the VM can be started successfully, and cpuset.mems is
> set to the same nodeset.  So I'll revise my patch and send a V2 series.
> Conclusion:
>   1/3: add a check for 'interleave' mode with a single NUMA node
>   2/3: fix this problem in qemu
>   3/3: fix this problem in lxc
> Is that OK?
>
>> Anyway, after either one is fixed, I can push this.
>
> I tested this problem again and found that the error occurs with every
> memory mode.  It was broken by commit
> 411cea638f6ec8503b7142a31e58b1cd85dbeaba, which I authored:
>
>     qemu: move setting emulatorpin ahead of monitor showing up
>
> I'm sorry for that.  That patch moved qemuSetupCgroupForEmulator before
> qemuSetupCgroupPostInit.  I have two ideas to fix it:
>
> 1. Move qemuSetupCgroupPostInit ahead of monitor showing up, too (of
>    course before qemuSetupCgroupForEmulator).  This would fix the bug
>    introduced by my patch.  (RFC)
That cannot be done, IIRC, because we need the monitor to get the
vCPU <-> thread mapping from it.
> 2. Anyway, once the first problem is fixed, there is the second problem,
>    which is the one I originally wanted to fix.  If the memory mode is
>    'preferred' with one node (such as nodeset='0'), the domain's memory
>    is not necessarily on node 0.  Assuming node 0 doesn't have enough
>    memory, memory can be allocated on node 1.  If we then set
>    cpuset.mems to '0', it may cause an OOM.  The solution is to check
>    the memory mode in (lxc|qemu)SetupCpusetMems, as in my patch from
>    Tuesday, such as:
>
>    +    if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
>    +        VIR_DOMAIN_NUMATUNE_MEM_PREFERRED) {
Either this (as it makes sense to restrict qemu even for 'interleave')
or the previous check is fine too (just because that was what we did
before; I just rewrote it with a few problems).
> BTW: 3. After the first problem has been fixed, we can start domains
> with this XML:
>
>     <numatune>
>       <memory mode='interleave' nodeset='0'/>
>     </numatune>
>
> Is a single node '0' valid for 'interleave'?  I take 'interleave' to
> mean 'at least two nodes'.
Well, interleave over 1 node is effectively 'strict', isn't it?  What
errors do you get if you try that?  (My kernel stopped accepting
numa=fake=2 as a cmdline parameter :( )  Anyway, I think the best way
would be mimicking the old behaviour by just adding your first proposed
fix, "if (mode != STRICT) return 0", together with the fixed-up commit
message.

Martin
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list