On 1/11/21 2:56 PM, Andrea Bolognani wrote:
On Wed, 2021-01-06 at 12:50 -0300, Daniel Henrique Barboza wrote:
On 1/6/21 12:03 PM, Daniel P. Berrangé wrote:
virDomainCreateXML with Libvirt 7.0.0 on the source will set
PARSE_ABI_UPDATE and thus set the new, smaller RAM size.
Now we live migrate to libvirt 6.9.0 on the destination host, and that will
not set PARSE_ABI_UPDATE and will thus keep the larger RAM size.
In this scenario, yes, the memory modules on the destination will not be aligned at
PostParse time, but they won't be used to recalculate initialmem/totalmem
because we don't align memory during live migration.
Since we now perform memory alignment earlier, we will reflect the
aligned size back to the XML instead of re-aligning it at command
line generation time every single time the VM is started.
That's correct.
So the XML that we're going to send to the migration destination will
contain sizes that are already aligned, and thus further alignment
should not be necessary.
In fact, the destination will not realign the memory under any circumstance, period.
If a miscalculation happens on the source and the memory ends up unaligned,
the destination will have to deal with it.
At least, that's the theory :)
Daniel, did you test migration between libvirt 7.0.0 and earlier
releases? Can you confirm it works?
Just tested a migration scenario between 2 hosts (a Power 9 Boston running Libvirt
6.8.0 and a Power 9 Mihawk running Libvirt 7.0.0), with a domain that reproduces
the bug.
TL;DR: migration works both ways without issues. Details below if interested:
1) created a common domain XML with the following format:
(...)
<maxMemory slots='16' unit='KiB'>4194304</maxMemory>
<memory unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory>
(...)
<memory model='dimm'>
<target>
<size unit='KiB'>323264</size>
</target>
<address type='dimm' slot='0'/>
</memory>
(...)
The dimm is left unaligned on purpose to demonstrate how an older Libvirt behaves
vs 7.0.0.
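For reference, on ppc64 Libvirt rounds memory sizes up to a 256 MiB boundary. A minimal sketch of that rounding, assuming the 256 MiB granularity (the helper name below is ours, not libvirt's):

```python
# Sketch of ppc64 memory alignment: round a KiB size up to the next
# 256 MiB boundary (granularity assumed from libvirt's ppc64 handling).
ALIGN_KIB = 256 * 1024  # 256 MiB expressed in KiB

def align_up(size_kib, align_kib=ALIGN_KIB):
    """Round size_kib up to the next multiple of align_kib."""
    return -(-size_kib // align_kib) * align_kib

# The deliberately unaligned dimm from the XML above:
print(align_up(323264))  # -> 524288 KiB, i.e. 512 MiB
```

This is why the dimm shows up as 524288 KiB after PostParse alignment in step (6) below.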
2) In a Power 9 Boston machine running Libvirt 6.8.0, I created a 'vm1_6.8.0' domain
using the XML above. Since there is no PostParse alignment in this version, the domain
was defined with the unaligned dimm.
3) Started the VM and checked the QEMU command line:
[danielhb@ltc-boston118 build]$ sudo ./run tools/virsh start vm1_6.8.0
Domain vm1_6.8.0 started
[danielhb@ltc-boston118 build]$ sudo tail -n 30 ~/libvirt_build/var/log/libvirt/qemu/vm1_6.8.0.log
(...)
-cpu POWER9 \
-m size=1835008k,slots=16,maxmem=4194304k \
(...)
-object memory-backend-ram,id=memdimm0,size=536870912,host-nodes=8,policy=bind \
-device pc-dimm,memdev=memdimm0,id=dimm0,slot=0 \
-uuid f3545d9d-f8e6-4569-9e10-ade357b28163 \
(...)
Note that "-m" plus the dimm is more than the 2 GiB size, which is the bug I fixed in
Libvirt 7.0.0 for new domains.
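A quick sanity check of that arithmetic, with the sizes copied from the log above (note that "-m" is in KiB while the memory-backend-ram size is in bytes):

```python
# Sizes from the 6.8.0 QEMU command line above.
base_kib = 1835008            # from "-m size=1835008k"
dimm_kib = 536870912 // 1024  # memory-backend-ram size is in bytes -> 524288 KiB

total_kib = base_kib + dimm_kib
print(total_kib)              # -> 2359296 KiB, i.e. 2.25 GiB, above the 2 GiB in <memory>
```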
4) Migrating the vm1_6.8.0 domain to a Power 9 Mihawk server, running Libvirt 7.0.0:
$ sudo ./run tools/virsh -c 'qemu:///system' migrate --live --domain vm1_6.8.0 \
--desturi qemu+ssh://ltcmihawk39/system --timeout 60
$
5) In the Mihawk host:
[danielhb@ltcmihawk39 build]$ sudo ./run tools/virsh list --all
[sudo] password for danielhb:
 Id   Name        State
--------------------------
 1    vm1_6.8.0   running
[danielhb@ltcmihawk39 build]$ sudo tail -n 30 ~/libvirt_build/var/log/libvirt/qemu/vm1_6.8.0.log
(...)
-cpu POWER9 \
-m size=1835008k,slots=16,maxmem=4194304k \
(...)
-object memory-backend-ram,id=memdimm0,size=536870912,host-nodes=8,policy=bind \
-device pc-dimm,memdev=memdimm0,id=dimm0,slot=0,addr=2147483648 \
-uuid f3545d9d-f8e6-4569-9e10-ade357b28163 \
(...)
As predicted, there is no recalculation of memory alignment in the destination. This
is the same QEMU command line as in (3).
6) Going the other way around, I defined a vm1_7.0.0 domain using the domain XML described
in (1). Since it's a new domain created with Libvirt 7.0.0, the dimm was aligned in the
PostParse callback:
[danielhb@ltcmihawk39 build]$ sudo ./run tools/virsh dumpxml vm1_7.0.0
(...)
<domain type='kvm'>
<name>vm1_7.0.0</name>
<uuid>6a45986d-68c4-4296-afa9-83df4e6ea6cd</uuid>
<maxMemory slots='16' unit='KiB'>4194304</maxMemory>
<memory unit='KiB'>2097152</memory>
<currentMemory unit='KiB'>2097152</currentMemory>
(...)
<memory model='dimm'>
<target>
<size unit='KiB'>524288</size>
</target>
<address type='dimm' slot='0'/>
</memory>
(...)
7) Started the domain and checked QEMU arguments:
[danielhb@ltcmihawk39 build]$ sudo ./run tools/virsh start vm1_7.0.0
Domain 'vm1_7.0.0' started
[danielhb@ltcmihawk39 build]$ sudo tail -n 30 ~/libvirt_build/var/log/libvirt/qemu/vm1_7.0.0.log
(...)
-cpu POWER9 \
-m size=1572864k,slots=16,maxmem=4194304k \
(...)
-object memory-backend-ram,id=memdimm0,size=536870912,host-nodes=8,policy=bind \
-device pc-dimm,memdev=memdimm0,id=dimm0,slot=0 \
-uuid 6a45986d-68c4-4296-afa9-83df4e6ea6cd \
(...)
Here, we can see that this QEMU is indeed using the intended 2 GiB of RAM instead of
1835008k + 536870912 bytes as in (3) with an older Libvirt.
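The same sanity check against the 7.0.0 command line confirms the total now matches <memory> exactly:

```python
# Sizes from the 7.0.0 QEMU command line above.
base_kib = 1572864            # from "-m size=1572864k"
dimm_kib = 536870912 // 1024  # 512 MiB dimm backend, bytes -> KiB

print(base_kib + dimm_kib)    # -> 2097152 KiB, i.e. exactly the 2 GiB from <memory>
```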
8) Migrate it to the Boston server with Libvirt 6.8.0 and check if all went according to plan:
[danielhb@ltcmihawk39 build]$ sudo ./run tools/virsh -c 'qemu:///system' migrate --live \
--domain vm1_7.0.0 --desturi qemu+ssh://ltc-boston118/system --timeout 60
[danielhb@ltc-boston118 build]$ sudo ./run tools/virsh list --all
[sudo] password for danielhb:
 Id   Name        State
---------------------------
 4    vm1_7.0.0   running
 -    vm1_6.8.0   shut off
[danielhb@ltc-boston118 build]$ sudo tail -n 30 ~/libvirt_build/var/log/libvirt/qemu/vm1_7.0.0.log
(...)
-cpu POWER9 \
-m size=1572864k,slots=16,maxmem=4194304k \
(...)
-object memory-backend-ram,id=memdimm0,size=536870912,host-nodes=8,policy=bind \
-device pc-dimm,memdev=memdimm0,id=dimm0,slot=0,addr=2147483648 \
-uuid 6a45986d-68c4-4296-afa9-83df4e6ea6cd \
(...)
We can see that the domain was migrated using the memory sizes from the source host,
as intended.
Thanks,
DHB