On 6/2/2022 8:42 AM, Igor Mammedov wrote:
On Tue, 31 May 2022 13:00:07 -0400
mike tancsa <mike@xxxxxxxxxx> wrote:
Hello,
I have been using kvm since the Ubuntu 18 and 20.x LTS series of
kernels and distributions without any issues on a whole range of Guests
up until now. Recently, we spun up an Ubuntu LTS 22 hypervisor to add to
the mix and eventually upgrade to. Hardware is a series of Ryzen 7 CPUs
(3700x). Migrations back and forth without issue for Ubuntu 20.x
kernels. The first Ubuntu 22 machine was on identical hardware and all
was good with that too. The second Ubuntu 22 based machine was spun up
with a newer gen Ryzen, a 5800x. On the initial kernel version that
came with that release back in April, migrations worked as expected
between hardware as well as different kernel versions and qemu / KVM
versions that come default with the distribution. Not sure if migrations
between kernel and KVM versions "accidentally" worked all these years,
but they did. However, we ran into an issue with the kernel
5.15.0-33-generic (possibly with 5.15.0-30 as well) that's part of
Ubuntu. Migrations no longer worked to older generation CPUs. I could
send a guest TO the box and all was fine, but upon sending the guest to
another hypervisor, the sender would see it as successfully migrated,
but the VM would typically just hang, with 100% CPU utilization, or
sometimes crash. I tried a 5.18 kernel from May 22nd and again the
behavior is different. If I specify the CPU as EPYC or EPYC-IBPB, I can
migrate back and forth.
Perhaps you are hitting the issue fixed by:
https://lore.kernel.org/lkml/CAJ6HWG66HZ7raAa+YK0UOGLF+4O3JnzbZ+a-0j8GNixOhLk9dA@xxxxxxxxxxxxxx/T/
Thanks for the response. I am not sure. That patch is from Feb. Would
the bug have been introduced sometime in May to the 5.15 kernel that
Ubuntu 22 would have tracked?
Looking at the CPU flags diff between the 3700x and the 5800x:
diff -u 3700x 5800x
--- 3700x 2022-06-02 14:57:00.331309878 +0000
+++ 5800x 2022-06-02 14:56:52.403340136 +0000
@@ -77,6 +77,7 @@
hw_pstate
ssbd
mba
+ibrs
ibpb
stibp
vmmcall
@@ -85,6 +86,8 @@
avx2
smep
bmi2
+erms
+invpcid
cqm
rdt_a
rdseed
@@ -122,13 +125,15 @@
vgif
v_spec_ctrl
umip
+pku
+ospke
+vaes
+vpclmulqdq
rdpid
overflow_recov
succor
smca
-sme
-sev
-sev_es
+fsrm
bugs
sysret_ss_attrs
spectre_v1
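
For reference, the diff above boils down to two sets of flags. The following is a minimal Python sketch, not something I ran on the hosts: the helper name flag_delta and the list names are made up for illustration, and the lists contain only the flags visible in the diff hunks, not the full flag set of either CPU. It only restates the diff per migration direction; it does not show which flag, if any, is behind the hang.

```python
def flag_delta(src_flags, dst_flags):
    """Return (flags the destination lacks, flags only the destination has)."""
    src, dst = set(src_flags), set(dst_flags)
    return sorted(src - dst), sorted(dst - src)


# Hypothetical, partial lists: only the flags visible in the diff hunks.
COMMON = [
    "hw_pstate", "ssbd", "mba", "ibpb", "stibp", "vmmcall",
    "avx2", "smep", "bmi2", "cqm", "rdt_a", "rdseed",
    "vgif", "v_spec_ctrl", "umip", "rdpid",
    "overflow_recov", "succor", "smca",
]
ONLY_5800X = ["ibrs", "erms", "invpcid", "pku", "ospke",
              "vaes", "vpclmulqdq", "fsrm"]
ONLY_3700X = ["sme", "sev", "sev_es"]

flags_5800x = COMMON + ONLY_5800X
flags_3700x = COMMON + ONLY_3700X

# Direction that fails for me: source 5800x, destination 3700x.
missing_on_3700x, extra_on_3700x = flag_delta(flags_5800x, flags_3700x)

print("5800x -> 3700x, flags the destination lacks:", missing_on_3700x)
print("3700x -> 5800x, flags the destination lacks:", extra_on_3700x)
```

So in the failing direction (5800x to 3700x) the destination host is the one missing host flags, while in the working direction only sme, sev and sev_es are absent on the destination.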
Quick summary:

On Ubuntu 20.04 LTS with the latest Ubuntu updates, I can migrate VMs back
and forth between a 3700x and a 5800x without issue. Guests are a mix of
Ubuntu, Fedora and FreeBSD.

On Ubuntu 22 LTS, with the original kernel from release day, I can
migrate VMs back and forth between a 3700x and a 5800x without issue.

On Ubuntu 22 LTS with everything up to date as of mid May 2022, I can
migrate from the 3700x to the 5800x without issue. But going from the
5800x to the 3700x results in a migrated VM that either crashes inside
the VM or has the CPU pegged at 100%, spinning its wheels with the guest
frozen and needing a hard reset. This happens with or without --live and
with or without --unsafe. The crash / hang happens once the VM is fully
migrated, with the sender thinking it was successfully sent and the
receiver thinking it successfully arrived.

On stock Ubuntu 22 (5.15.0-33-generic) I can migrate back and forth to
Ubuntu 20 as long as the hardware / CPU is identical (in this case, 3700x).

On Ubuntu 22 LTS with everything up to date as of mid May 2022, running
5.18.0-051800-generic #202205222030 SMP PREEMPT_DYNAMIC Sun May 22, I
can migrate VMs back and forth if their CPU definition is EPYC or
EPYC-IBPB. If the definition (in my one test case anyways) is Nehalem,
then I get a frozen VM on migration back to the 3700x.

Some more details at
https://ubuntuforums.org/showthread.php?t=2475399
Is this a bug? Expected behavior? Is there a better place to ask
these questions?
Thanks in advance!
---Mike