Cc'ing Vitaly.
On 26/10/23 07:49, Eiichi Tsukata wrote:
Hi all,
Here are additional details on the issue.
We found this issue while testing Windows Virtual Secure Mode (VSM) VMs.
We sometimes saw live migration failures of VSM-enabled VMs. It turned
out that the issue happens during live migration when VMs change boot-related
EFI variables (e.g. BootOrder, Boot0001).
After some debugging, I've found the race I mentioned in the commit message.
Symptom
=======
When it happens with the latest QEMU, which has commit https://github.com/qemu/qemu/commit/7191f24c7fcfbc1216d09,
QEMU shows the following error message on the destination:
qemu-system-x86_64: Failed to put registers after init: Invalid argument
If it happens with an older QEMU that doesn't have the commit, we instead see a CPU dump like this:
KVM internal error. Suberror: 3
extra data[0]: 0x0000000080000b0e
extra data[1]: 0x0000000000000031
extra data[2]: 0x0000000000000683
extra data[3]: 0x000000007f809000
extra data[4]: 0x0000000000000026
RAX=0000000000000000 RBX=0000000000000000 RCX=0000000000000000 RDX=0000000000000f61
RSI=0000000000000000 RDI=0000000000000000 RBP=0000000000000000 RSP=0000000000000000
R8 =0000000000000000 R9 =0000000000000000 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=0000000000000000 R14=0000000000000000 R15=0000000000000000
RIP=000000000000fff0 RFL=00010002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0020 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 0000000000000000 ffffffff 00c00000
TR =0040 000000007f7df050 00068fff 00808b00 DPL=0 TSS64-busy
GDT= 000000007f7df000 0000004f
IDT= 000000007f836000 000001ff
CR0=80010033 CR2=000000000000fff0 CR3=000000007f809000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00
Code=?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? <??> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ??
In the above dump, CR3 points into the SMRAM region even though SMM=0.
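As a reading aid for the dump, here is a minimal sketch that decodes extra data[0]. It assumes that value is the VMX IDT-vectoring information field (bits 7:0 vector, bits 10:8 type, bit 11 error-code valid, bit 31 valid), which is my understanding of what KVM reports for suberror 3. The struct and function names are hypothetical and not taken from KVM or QEMU.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: fields of the IDT-vectoring information format. */
struct vectoring_info {
    unsigned vector;       /* bits 7:0, exception/interrupt vector */
    unsigned type;         /* bits 10:8, 3 = hardware exception */
    int has_error_code;    /* bit 11 */
    int valid;             /* bit 31 */
};

/* Split a raw 32-bit vectoring-information value into its fields. */
static struct vectoring_info decode_vectoring_info(uint32_t raw)
{
    struct vectoring_info vi;

    vi.vector = raw & 0xff;
    vi.type = (raw >> 8) & 0x7;
    vi.has_error_code = (raw >> 11) & 1;
    vi.valid = (raw >> 31) & 1;
    return vi;
}

/* Print the decoded fields in a form that is easy to compare with the dump. */
static void print_vectoring_info(uint32_t raw)
{
    struct vectoring_info vi = decode_vectoring_info(raw);

    printf("valid=%d type=%u vector=%u error_code=%d\n",
           vi.valid, vi.type, vi.vector, vi.has_error_code);
}
```

Under that assumption, 0x80000b0e decodes to a valid hardware exception with vector 14 (#PF) and an error code, i.e. the failure happened while delivering a page fault.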
Repro
=====
The repro steps are pretty simple.
* Run an SMM-enabled Linux guest with secure-boot-enabled OVMF.
* Run the following script in the guest:
    /usr/libexec/qemu-kvm &
    while true
    do
        efibootmgr -n 1
    done
* Do live migration.
In my environment, live migration fails in about 20% of attempts.
VMX specific
============
This issue is VMX specific. SVM is not affected because the validation
in svm_set_nested_state() differs slightly from the VMX one.
VMX:
static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
                                struct kvm_nested_state __user *user_kvm_nested_state,
                                struct kvm_nested_state *kvm_state)
{
    ..
    /*
     * SMM temporarily disables VMX, so we cannot be in guest mode,
     * nor can VMLAUNCH/VMRESUME be pending. Outside SMM, SMM flags
     * must be zero.
     */
    if (is_smm(vcpu) ?
        (kvm_state->flags &
         (KVM_STATE_NESTED_GUEST_MODE | KVM_STATE_NESTED_RUN_PENDING))
        : kvm_state->hdr.vmx.smm.flags)
        return -EINVAL;
    ..
SVM:
static int svm_set_nested_state(struct kvm_vcpu *vcpu,
                                struct kvm_nested_state __user *user_kvm_nested_state,
                                struct kvm_nested_state *kvm_state)
{
    ..
    /* SMM temporarily disables SVM, so we cannot be in guest mode. */
    if (is_smm(vcpu) && (kvm_state->flags & KVM_STATE_NESTED_GUEST_MODE))
        return -EINVAL;
    ..
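To make the difference concrete, here is a minimal user-space model of the two checks quoted above. It is only a sketch: the struct, the function names and the flag macros are hypothetical stand-ins (the flag values are illustrative, not copied from the KVM UAPI headers), and `in_smm` stands for what is_smm(vcpu) would return on the destination at the time the nested state is set.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM_STATE_NESTED_* flags (illustrative values). */
#define MODEL_NESTED_GUEST_MODE  (1u << 0)
#define MODEL_NESTED_RUN_PENDING (1u << 1)
/* Hypothetical SMM flag meaning "the vCPU was in VMX operation when it entered SMM". */
#define MODEL_SMM_VMXON          (1u << 1)

struct model_nested_state {
    uint16_t flags;      /* models kvm_state->flags */
    uint16_t smm_flags;  /* models kvm_state->hdr.vmx.smm.flags */
};

/* Models the VMX check: outside SMM, any SMM flag is rejected. */
static int model_vmx_check(bool in_smm, const struct model_nested_state *s)
{
    if (in_smm ? (s->flags &
                  (MODEL_NESTED_GUEST_MODE | MODEL_NESTED_RUN_PENDING))
               : s->smm_flags)
        return -EINVAL;
    return 0;
}

/* Models the SVM check: only "in SMM and in guest mode" is rejected. */
static int model_svm_check(bool in_smm, const struct model_nested_state *s)
{
    if (in_smm && (s->flags & MODEL_NESTED_GUEST_MODE))
        return -EINVAL;
    return 0;
}
```

With a saved state of `{ .flags = 0, .smm_flags = MODEL_SMM_VMXON }`, model_vmx_check(false, ...) returns -EINVAL while model_vmx_check(true, ...) returns 0. That matches the race: if the nested state is set before the vCPU's SMM flag has been restored, the VMX check sees "outside SMM" with non-zero SMM flags and fails. The SVM model accepts the same state in both cases.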
Thanks,
Eiichi
On Oct 26, 2023, at 14:42, Eiichi Tsukata <eiichi.tsukata@xxxxxxxxxxx> wrote:
kvm_put_vcpu_events() needs to be called before kvm_put_nested_state()
because the vCPU's hflags are checked by the KVM vmx_set_nested_state()
validation. Otherwise kvm_put_nested_state() can fail with -EINVAL when
a vCPU is in VMX operation and enters SMM. This leads to live
migration failure.
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@xxxxxxxxxxx>
---
target/i386/kvm/kvm.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e7c054cc16..cd635c9142 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4741,6 +4741,15 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
         return ret;
     }
 
+    /*
+     * must be before kvm_put_nested_state so that HF_SMM_MASK is set during
+     * SMM.
+     */
+    ret = kvm_put_vcpu_events(x86_cpu, level);
+    if (ret < 0) {
+        return ret;
+    }
+
     if (level >= KVM_PUT_RESET_STATE) {
         ret = kvm_put_nested_state(x86_cpu);
         if (ret < 0) {
@@ -4787,10 +4796,6 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
     if (ret < 0) {
         return ret;
     }
-    ret = kvm_put_vcpu_events(x86_cpu, level);
-    if (ret < 0) {
-        return ret;
-    }
     if (level >= KVM_PUT_RESET_STATE) {
         ret = kvm_put_mp_state(x86_cpu);
         if (ret < 0) {
--
2.41.0