Re: [PATCH] qemu: don't setup cpuset.mems if memory mode in numatune is 'preferred'

On Fri, Nov 07, 2014 at 05:36:43PM +0800, Wang Rui wrote:
On 2014/11/5 16:07, Martin Kletzander wrote:
[...]
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index b5bdb36..8685d6f 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -618,6 +618,11 @@ qemuSetupCpusetMems(virDomainObjPtr vm,
    if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
        return 0;

+    if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
+        VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
+        return 0;
+    }
+

One question, is it problem only for 'preferred' or 'interleaved' as
well?  Because if it's only problem for 'preferred', then the check is
wrong.  If it's problem for 'interleaved' as well, then the commit
message is wrong.

'interleave' with a single node (such as nodeset='0') causes the same error.
But 'interleave' mode shouldn't be used with a single node anyway, so maybe
there should be another bugfix to reject 'interleave' with a single node.
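A minimal standalone sketch of such a validity check (the function names and the plain unsigned-long nodeset mask are invented for illustration here; the real check would use libvirt's virBitmap helpers on the parsed nodeset):

```c
#include <assert.h>
#include <stdbool.h>

/* Count the set bits in a nodeset mask; each set bit stands for one
 * host NUMA node (bit 0 = node 0, bit 1 = node 1, ...). */
static int
count_nodes(unsigned long nodeset)
{
    int n = 0;
    while (nodeset) {
        n += nodeset & 1;
        nodeset >>= 1;
    }
    return n;
}

/* 'interleave' spreads allocations across nodes, so a nodeset with
 * fewer than two nodes degenerates to single-node placement and would
 * be rejected at definition-validation time. */
static bool
interleave_nodeset_is_valid(unsigned long nodeset)
{
    return count_nodes(nodeset) >= 2;
}
```

With such a check, nodeset='0' (mask 0x1) would be rejected for 'interleave' while nodeset='0-1' (mask 0x3) passes.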


Well, I'd be OK with just changing the commit message to mention that.
This fix is still a valid one and will fix both issues, won't it?

If configured with 'interleave' and multiple nodes (such as nodeset='0-1'),
the VM can be started successfully, and cpuset.mems is set to the same nodeset.
So I'll revise my patch.

I'll send a V2 series. Summary:

1/3 : add a check for 'interleave' mode with a single NUMA node
2/3 : fix this problem in qemu
3/3 : fix this problem in lxc

Is it OK?

Anyway, after either one is fixed, I can push this.


I tested this problem again and found that the error occurs with every
memory mode. It was broken by commit 411cea638f6ec8503b7142a31e58b1cd85dbeaba,
which I authored:
   qemu: move setting emulatorpin ahead of monitor showing up

I'm sorry for that.

That patch moved qemuSetupCgroupForEmulator before qemuSetupCgroupPostInit.

I have some ideas on how to fix that.

1. Move qemuSetupCgroupPostInit ahead of the monitor showing up, too.
  Of course that puts it before qemuSetupCgroupForEmulator.
  This would fix the bug I introduced.
  (RFC)


That cannot be done, IIRC, because we need the monitor to get the
vCPU <-> thread mapping from it.

2. Even with the first problem fixed, there is still the second problem,
  which is what I wanted to fix originally. If the memory mode is
  'preferred' with one node (such as nodeset='0'), the domain's memory
  is not guaranteed to be allocated on node 0. Suppose node 0 doesn't
  have enough memory; the allocation may then be satisfied from node 1.
  If we subsequently set cpuset.mems to '0', that may cause an OOM.
  The solution is to check the memory mode in (lxc|qemu)SetupCpusetMems
  as in my patch from Tuesday, such as:

  +    if (virDomainNumatuneGetMode(vm->def->numatune, -1) !=
  +        VIR_DOMAIN_NUMATUNE_MEM_PREFERRED) {
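For illustration, a standalone sketch of that decision in qemuSetupCpusetMems (the enum and function names below are invented stand-ins mirroring the VIR_DOMAIN_NUMATUNE_MEM_* modes, not the actual libvirt symbols):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the VIR_DOMAIN_NUMATUNE_MEM_* memory modes. */
typedef enum {
    MEM_MODE_STRICT,
    MEM_MODE_PREFERRED,
    MEM_MODE_INTERLEAVE,
} mem_mode;

/* Only 'strict' guarantees that allocations stay inside the configured
 * nodeset.  With 'preferred', the kernel may fall back to another node
 * (e.g. node 1 when node 0 is full); writing cpuset.mems='0' afterwards
 * would forbid those already-placed pages and can OOM the guest.  So the
 * cpuset.mems write is skipped for anything but 'strict'. */
static bool
should_set_cpuset_mems(mem_mode mode)
{
    return mode == MEM_MODE_STRICT;
}
```

Restricting the write to 'strict' also covers the 'interleave' single-node case, since cpuset.mems is simply never narrowed for that mode.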


Either this (as it makes sense to restrict qemu even for 'interleave')
or the previous check is fine too (that was simply what we did before;
I just rewrote it with a few problems).

BTW:
3. After the first problem has been fixed, we can start domains with xml:
 <numatune>
   <memory mode='interleave' nodeset='0'/>
 </numatune>

 Is a single node '0' valid for 'interleave'? I understand 'interleave'
 to mean 'at least two nodes'.


Well, interleave of 1 node is effectively 'strict', isn't it?  What
errors do you get if you try that?  (my kernel stopped accepting
numa=fake=2 as a cmdline parameter :( )

Anyway, I think the best way would be mimicking the old behaviour by
just adding your first proposed fix, "if (mode != STRICT) return 0",
along with the fixed-up commit message.

Martin


--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
