Sorry for the super slow review, I was dreading looking at KVM's SMM mess :-) On Wed, May 01, 2024, Kishen Maloor wrote: > diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c > index d06d43d8d2aa..b1dac967f1a5 100644 > --- a/arch/x86/kvm/smm.c > +++ b/arch/x86/kvm/smm.c > @@ -633,8 +633,16 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt) > > #ifdef CONFIG_X86_64 > if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) > - return rsm_load_state_64(ctxt, &smram.smram64); > + ret = rsm_load_state_64(ctxt, &smram.smram64); > else > #endif > - return rsm_load_state_32(ctxt, &smram.smram32); > + ret = rsm_load_state_32(ctxt, &smram.smram32); > + > + /* > + * Set nested_run_pending to ensure completion of a nested VM-Entry > + */ > + if (ret == X86EMUL_CONTINUE && ctxt->ops->is_guest_mode(ctxt)) No need to bounce through the emulation context, this can simply be if (ret == X86EMUL_CONTINUE && is_guest_mode(vcpu)) > + vcpu->arch.nested_run_pending = 1; That said, while I 100% agree that KVM shouldn't set nested_run_pending before loading state from SMRAM, I think it's actually the wrong fix. I am fairly certain that the real issue is that KVM is synthesizing shutdown _for L2_. Neither the the SDM and APM state that a shutdown on RSM to guest mode can trigger a VM-Exit, Intel's pseudocode explicitly says state is loaded from SMRAM before transitioning the CPU back to non-root mode, and AMD saves VMCB state in SMRAM, i.e. it's not treated differently (like Intel does for VMX state). So, the more correct, and amusingly easier, fix is to forcibly leave nested mode prior to signalling shutdown. I'll post a patch for that. I think I'll also grab patch 1 and post it in a slightly larger cleanup series at some point. Making nested_run_pending a common field is worthwhile regardless of this SMM mess.