Windows Server 2016 with Hyper-V enabled fails to boot on OVMF with SMM
(OVMF_CODE-need-smm.fd). It turns out that the SMM emulation code in KVM
does not handle nested virtualization very well, leading to a whole bunch
of issues.

For example, Hyper-V uses descriptor table exiting (SECONDARY_EXEC_DESC),
so when the SMM handler tries to switch from real mode, a VM exit occurs
and is forwarded to a clueless L1.

This series fixes it by switching the vcpu to !guest_mode, i.e. to the L1
state, before entering SMM and then switching back to L2 as part of
emulating the RSM instruction.

Patches 1 and 2 are common for both Intel and AMD, patches 3-4 fix Intel,
and patches 5-6 fix AMD.

v1->v2:
* Moved left_smm detection to emulator_set_hflags (couldn't quite get rid
  of the field despite my original claim) (Paolo)
* Moved the kvm_x86_ops->post_leave_smm() call a few statements down so
  it really runs after all state has been synced.
* Added the smi_allowed callback (new patch 2) to avoid running into
  WARN_ON_ONCE(vmx->nested.nested_run_pending) on Intel.

v2->v3:
* Omitted patch 4 ("KVM: nVMX: save nested EPT information in SMRAM state
  save map") and replaced it with ("treat CR4.VMXE as reserved in SMM")
  (Paolo)
* Implemented smi_allowed on AMD to support SMI interception. Turns out
  Windows needs this when running on >1 vCPU.
* Eliminated internal SMM state on AMD and switched to using the SMM
  state save area in guest memory instead (Paolo)

v3->v4:
* Changed the order of operations in enter_smm(), now saving the original
  (and potentially L2) state into the SMM state save area.
* Made em_rsm() reload the SMM state save area if post_leave_smm()
  entered guest mode. This way, SMM handlers see and may change the
  actual state of the vCPU at the point where SMI was injected (Radim)
* In patch 4, switched to a different way of avoiding the problem of
  hitting the very check the patch is adding.

v4->v5:
* Removed patch 4 (CR4.VMXE protection in SMM, will be done separately).
  Patch 3 became 4, and the new patch 3 fixes a bug in
  load_vmcs12_host_state() which prevented SMM exit to L2 from working
  without first restoring the state from the SMM state save area.
* Eliminated the first restore from SMM state save area (Paolo)
* Tweaked the HF_SMM_MASK flag manipulation (Paolo)

Ladi Prosek (6):
  KVM: x86: introduce ISA specific SMM entry/exit callbacks
  KVM: x86: introduce ISA specific smi_allowed callback
  KVM: nVMX: set IDTR and GDTR limits when loading L1 host state
  KVM: nVMX: fix SMI injection in guest mode
  KVM: nSVM: refactor nested_svm_vmrun
  KVM: nSVM: fix SMI injection in guest mode

 arch/x86/include/asm/kvm_emulate.h |   2 +
 arch/x86/include/asm/kvm_host.h    |   7 ++
 arch/x86/kvm/emulate.c             |   9 ++
 arch/x86/kvm/svm.c                 | 205 +++++++++++++++++++++++++------------
 arch/x86/kvm/vmx.c                 |  79 ++++++++++++--
 arch/x86/kvm/x86.c                 |  20 +++-
 6 files changed, 245 insertions(+), 77 deletions(-)
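
For readers skimming the series, the approach described in the cover text
(leave guest mode before entering SMM, return to L2 when RSM is emulated)
can be illustrated with a small standalone model. This is only a sketch
under stated assumptions: struct toy_vcpu and every toy_* helper are
hypothetical names invented for illustration and are not the kernel's
types or callbacks, and the single saved_guest_mode flag merely stands in
for whatever the patches store in the SMM state save area.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_vcpu {
	bool guest_mode;       /* true while the vCPU runs L2 (nested guest) */
	bool in_smm;           /* true while the SMM handler is running */
	bool saved_guest_mode; /* stand-in for state kept in the SMM save area */
};

/* Hypothetical ISA-specific hook run before SMM entry. */
static void toy_pre_enter_smm(struct toy_vcpu *v)
{
	v->saved_guest_mode = v->guest_mode; /* remember whether we were in L2 */
	v->guest_mode = false;               /* SMM handler runs in L1 context */
}

/* Hypothetical ISA-specific hook run when RSM is emulated. */
static void toy_post_leave_smm(struct toy_vcpu *v)
{
	if (v->saved_guest_mode)
		v->guest_mode = true;        /* resume the interrupted L2 */
	v->saved_guest_mode = false;
}

/* SMI delivery: drop to L1 state first, then enter SMM. */
static void toy_enter_smm(struct toy_vcpu *v)
{
	assert(!v->in_smm);
	toy_pre_enter_smm(v);
	v->in_smm = true;
}

/* RSM emulation: leave SMM, then switch back to L2 if needed. */
static void toy_rsm(struct toy_vcpu *v)
{
	assert(v->in_smm);
	v->in_smm = false;
	toy_post_leave_smm(v);
}

/*
 * In this model, an exit is handed to the L1 hypervisor only while the
 * vCPU is in guest mode. With the hooks above, this is false for the
 * whole duration of the SMM handler.
 */
static bool toy_exit_forwarded_to_l1(const struct toy_vcpu *v)
{
	return v->guest_mode;
}
```

In this model, an SMI that arrives while L2 is running leaves guest_mode
clear for as long as in_smm is set, so nothing the SMM handler does is
forwarded to L1, and toy_rsm() puts the vCPU back into L2 afterwards.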