On Fri, Feb 16, 2024 at 11:30:09 +0000, Daniel P. Berrangé wrote: > On Wed, Feb 14, 2024 at 02:16:58PM +0100, Jiri Denemark wrote: > > On Wed, Feb 14, 2024 at 13:23:09 +0100, Tim Wiederhake wrote: > > > My knowledge about migration is limited, hence I am hesitant to make > > > factual claims. That being said, my understanding is that by requesting > > > e.g. a Skylake-Client cpu 'mpx' was never actually enabled in the VM as > > > qemu's version of the same cpu model did not include that feature. > > > > Well, if our CPU model has mpx than QEMU must have had it too at some > > point. The question is whether it was ever released. And also whether > > the feature could have ever been enabled in any running domain or trying > > to enable it caused the domain to fail to start anyway. > > QEMU CPUs originally had 'mpx', and it was later turned off by > a 4.0 machine type versions. Right, by commit ecb85fe48cacb2f8740186e81f2f38a2e02bd963. Machine types 3.1 and older will still get the CPU models with mpx enabled, if I understand the compat code correctly. I tried to analyze all possible combinations during migration and came up with the following table. To make it reasonably small I use some definitions: - "CPU def" is a relevant part of CPU definition transferred as part of domain XML during migration from source to destination host. In other words, it's describing the actual virtual CPU created by QEMU on the source host. - "$M" stands for any model affected by the change in this patch. - "+mpx" means mpx was explicitly enabled. - "-mpx" means mpx was explicitly disabled. - "old" means libvirt release without this patch applied. - "new" is libvirt with this patch applied. CPU def | src | dest | result -----------|------|------|---------------------------------------------- anything | old | old |\ no issue, both side share anything | new | new |/ the same definition of $m ------------------------------------------------------------------------ $M +mpx | old | new |\ the definition of $M is irrelevant because $M +mpx | new | old | \ mpx is explicitly disabled or enabled; $M -mpx | old | new | / the destination knows how the virtual CPU $M -mpx | new | old |/ looks like ------------------------------------------------------------------------ $M | old | new | migration gets aborted, see below the table ------------------------------------------------------------------------ $M | new | old | this will never happen; mpx is marked as | | | "removed" in libvirt's CPU map, which means | | | new libvirt will always explicitly mention | | | the state of mpx in the definition ------------------------------------------------------------------------ The only problematic case is when migrating a domain with $M without explicitly disabling or enabling mpx from an old libvirt to the new one. In case QEMU actually enables mpx when asked for $M, i.e., for machine types 3.1 and older running on a host that support mpx, libvirt thinks $M already contains mpx and does not explicitly mention it in the XML. The new libvirt instructs QEMU to start a machine with CPU model $M (without explicitly mentioning mpx) and the same old machine type as used on the source. So QEMU on the destination host uses the compatibility code which adds mpx to $M and enables it. But the new libvirt on the destination host has mpx already removed from the CPU model $M and it will complain about unexpected mpx=on when checking what features were enabled or disabled by QEMU and aborts the migration. That said, we can only safely remove CPU features from existing (released) CPU models if the problematic case can never happen. For example if the removed feature is completely broken and it's impossible to start a domain using this feature. Or when QEMU never enables the feature without being explicitly asked to do so. On the other hand, while thinking about all this I got an idea how we could make removing features safe even in the problematic case. But I first need to think about it more. I'll either send a patch for it after I'm done or I'll reply here that the idea was wrong :-) Jirka _______________________________________________ Devel mailing list -- devel@xxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx