On Fri, Jul 10, 2020 at 07:48:26 -0400, Mark Mielke wrote:
> On Fri, Jul 10, 2020 at 7:14 AM Jiri Denemark <jdenemar@xxxxxxxxxx> wrote:
> > On Sun, Jul 05, 2020 at 12:45:55 -0400, Mark Mielke wrote:
> > > With 6.4.0, live migration was working fine with QEMU 5.0. After
> > > trying out 6.5.0, migration broke with the following error:
> > >
> > > libvirt.libvirtError: internal error: unable to execute QEMU command
> > > 'migrate': State blocked by non-migratable CPU device (invtsc flag)
> >
> > Could you please describe the reproducer steps? For example, was the
> > domain you're trying to migrate already running when you upgraded
> > libvirt, or was it freshly started by the new libvirt?
>
> The original case was:
>
> 1) Machine X running libvirt 6.4.0 + QEMU 5.0
> 2) Machine Y running libvirt 6.5.0 + QEMU 5.0
> 3) Live migration from X to Y works. Guest appears fine.
> 4) Upgrade machine X from libvirt 6.4.0 to 6.5.0 and reboot.
> 5) Live migration from Y to X fails with the message shown.

Oh I see, so I guess the bad default is chosen during the incoming
migration to machine Y. I'll try to reproduce it myself to see what's
going on.

> In each case, live migration was done with OpenStack Train directing
> libvirt + QEMU.
>
> > And it would be helpful to see the <cpu> element as shown by virsh
> > dumpxml before you try to start the domain, as well as the QEMU
> > command line libvirt used to start the domain (in
> > /var/log/libvirt/qemu/$VM.log).
>
> The <cpu> element looks like this:
>
>   <cpu mode='host-passthrough' check='none'>
>     <topology sockets='1' dies='1' cores='4' threads='2'/>
>   </cpu>
>
> The QEMU command line is very long and includes details I would rather
> not publish publicly unless you need them. The "-cpu" portion is just:
>
>   -cpu host
>
> The QEMU command line itself is generated by libvirt, which is directed
> by OpenStack Train.

These are from machine X before step 3, right? Can you also share the
same from machine Y before step 5?

> I wasn't sure what QEMU_CAPS_CPU_MIGRATABLE represents. I initially
> suspected what you are saying, but since it apparently did not work the
> way I expected, I then presumed it does not work the way I expected. :-)
>
> Is QEMU_CAPS_CPU_MIGRATABLE derived only from the <cpu> element? If so,
> doesn't this mean that it is not explicitly listed for host-passthrough,
> and that the check is not properly detecting whether it is enabled?

QEMU_CAPS_CPU_MIGRATABLE comes from the QEMU capability probing.
Specifically, the capability is enabled when a given QEMU binary reports
a 'migratable' property for the CPU object. And the capability detection
tests show we should be detecting this capability properly:

tests/qemucapabilitiesdata $ git grep cpu.migratable
caps_2.12.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_3.0.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_3.1.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_4.0.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_4.1.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_4.2.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_5.0.0.x86_64.xml:  <flag name='cpu.migratable'/>
caps_5.1.0.x86_64.xml:  <flag name='cpu.migratable'/>

> I think it can go either way. There is also convention over
> configuration as a competing principle. However, I also prefer
> explicit. It just needs to be correct; otherwise explicit can be very
> bad, as it seems in my case. :-)

Of course, the explicit default must match the implicit one.
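
In the meantime, it would be interesting to see what the running domains
actually ended up with. The effective value can be read from the CPU
object via QMP; roughly like this, assuming the first vCPU sits at the
usual /machine/unattached/device[0] path (it can differ between setups)
and $VM is a placeholder for your domain name:

  virsh qemu-monitor-command --pretty $VM \
      '{"execute": "qom-get",
        "arguments": {"path": "/machine/unattached/device[0]",
                      "property": "migratable"}}'

If this returns false on machine Y while the original domain on machine X
reports true, it would confirm the wrong default is picked during the
incoming migration.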
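
The capability probing itself just asks QEMU whether the CPU device has
the property at all. You can do the same by hand with
device-list-properties; the following is only a rough equivalent of what
libvirt does (I believe we probe the 'max' CPU type for this, so take the
exact typename with a grain of salt):

  $ qemu-system-x86_64 -machine none -nodefaults -qmp stdio
  {"execute": "qmp_capabilities"}
  {"execute": "device-list-properties", "arguments": {"typename": "max-x86_64-cpu"}}

The reply should contain an entry like {"name": "migratable",
"type": "bool"} with any of the QEMU versions from the list above.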
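
And as a possible workaround until this is fixed: 6.5.0 added an explicit
migratable attribute for host-passthrough CPUs, so you should be able to
pin the migratable behaviour in the domain XML instead of relying on the
default, e.g.:

  <cpu mode='host-passthrough' check='none' migratable='on'>
    <topology sockets='1' dies='1' cores='4' threads='2'/>
  </cpu>

which should end up as "-cpu host,migratable=on" on the QEMU command
line. I have not verified that this helps in your exact scenario, though.

Jirka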