Hi libvirt people,

I've been looking at a (probable) bug and I'm not sure how to progress. The situation is a bit complicated and involves both QEMU and libvirt (and I think it may have been looked at already), so I would really appreciate some advice on how to approach it.

I'm using a pretty recent master version of libvirt from git and I'm testing on a ppc64le host with a similar guest, but this doesn't seem to be arch-specific.

If I create a QEMU guest (e.g. via virt-install) that requests both hugepage backing and NUMA memory placement on the host, the NUMA placement seems to be ignored. If I do:

    # echo 0 > /proc/sys/vm/nr_hugepages
    # echo 512 > /sys/devices/system/node/node0/hugepages/hugepages-16384kB/nr_hugepages
    # virt-install --name tmp --memory=4096 --graphics none --memorybacking hugepages=yes --disk none --import --wait 0 --numatune=8

... then hugepages are allocated on node 0 and the machine starts successfully, which seems like a bug. I believe it should fail to start due to insufficient memory, since the only hugepages available are on node 0 and not on the requested node 8. In fact that is what happens if cgroup support isn't detected in the host: there seems to be a fall-back path in libvirt (probably using mbind()) that works as I would expect.

Note: the relevant part of the guest XML seems to be this:

    <memoryBacking>
        <hugepages/>
    </memoryBacking>
    <numatune>
        <memory mode='strict' nodeset='8'/>
    </numatune>

It seems fairly clear what is happening: although QEMU is capable of allocating hugepages on specific NUMA nodes (using "memory-backend-file"), libvirt is not passing those options to QEMU in this situation.

I investigated this line of reasoning, and if I hack libvirt to pass those options to QEMU it does indeed fix the problem... but it renders the machine state migration-incompatible with unfixed versions. This seems to be why it hasn't been fixed already :-(

So what can we do?
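
For anyone repeating the reproduction above, one way to see which node the guest's pages were really taken from is to compare the per-node counters in sysfs before and after starting the guest. This is only a minimal sketch, assuming the usual /sys/devices/system/node layout; the function name node_hugepages and its optional root-directory argument are made up for this example.

```shell
#!/bin/sh
# Print total and free hugepages for every NUMA node and page size.
# The optional first argument overrides the sysfs root; it exists only so
# the loop can be tried against a scratch directory (an assumption of this
# sketch, not something libvirt or the kernel provides).
node_hugepages() {
    root=${1:-/sys/devices/system/node}
    for dir in "$root"/node*/hugepages/hugepages-*; do
        # Skip the unexpanded glob when no node directories exist.
        [ -d "$dir" ] || continue
        node=${dir#"$root"/}          # node0/hugepages/hugepages-16384kB
        node=${node%%/*}              # node0
        size=${dir##*/hugepages-}     # 16384kB
        printf '%s %s total=%s free=%s\n' "$node" "$size" \
            "$(cat "$dir/nr_hugepages")" "$(cat "$dir/free_hugepages")"
    done
}

node_hugepages "$@"
```

If the placement were honoured, the free count on the requested node should be the one that drops when the guest starts; in the case described above it would be node0 instead.
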
I assume it's not acceptable to just break migration with a bugfix, and I can only think of two ways to fix migration:

(a) Add a new flag to the XML, and for guests without the flag, maintain the old buggy behaviour (and therefore migration compatibility).

(b) Hack QEMU so that migration can succeed between un-fixed and fixed versions. (And possibly also in the reverse direction?)

I don't like (a) because it's visible in the XML, and would have to be carried forever (or at least a long time?).

I don't really like (b) either because it's tricky, and even if it could be made to work reliably, it would add mess and risk to the migration code. I'm not sure how the QEMU community would feel about it either. However, I did hack up some code and it worked at least in some simple cases.

Can anyone see a better approach? Is anyone already working on this?

Thanks,
Sam.