2018-12-12 14:28-0800, Jim Mattson:
> Just out of curiosity, how does the SVM nested implementation deal
> with the potential time-of-check to time-of-use bugs that were fixed
> on the VMX side with commit 4f2777bc9797 ("kvm: x86: nVMX: maintain
> internal copy of current VMCS")?

It'd be best to move the checks after we load the state.

SVM has a simpler model: the guest state is checked after it has been
loaded into the processor, and an immediate VM exit is triggered if
something is invalid.  This means KVM can let the processor do the
checks on most fields, so our pre-check only does three things:

* intercept contains INTERCEPT_VMRUN

  The check kind of works because we unconditionally add
  INTERCEPT_VMRUN later.  There is still a problem with visibility of
  writes after the check: if we get a write that clears INTERCEPT_VMRUN
  before a write to the VMCB, the VMRUN will succeed and the later
  write will take effect.

* asid != 0

  The ASID is never loaded, but it has the same visibility problem.

* nested_ctl doesn't set SVM_NESTED_CTL_NP_ENABLE when npt is disabled

  Looks buggy, as the field could be set after the check.  It creates
  an interesting scenario where we don't set SVM_NESTED_CTL_NP_ENABLE
  in the nested guest, because we actually preserve the original value,
  but still register NPT handlers with nested_svm_init_mmu_context().

I think the solution below could work, but my machine became a brick
after a reboot, so testing is going to take a while ...
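To make the window concrete, here is a stand-alone user-space sketch of
the pattern (plain pthreads; nothing below is KVM code, and
FAKE_INTERCEPT_VMRUN is a made-up bit, not SVM's encoding): a second
vCPU rewriting the VMCB between the check and the later read is all it
takes for the value we run with to be one that was never checked.

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define FAKE_INTERCEPT_VMRUN (1ULL << 32)	/* made-up bit for the demo */

/* stand-in for the VMCB that lives in guest memory */
static volatile uint64_t guest_intercept = FAKE_INTERCEPT_VMRUN;

static void *guest_vcpu(void *arg)
{
	/* the "guest" clears the intercept inside the check-to-use window */
	usleep(10 * 1000);
	guest_intercept &= ~FAKE_INTERCEPT_VMRUN;
	return NULL;
}

int main(void)
{
	pthread_t guest;
	uint64_t loaded;

	pthread_create(&guest, NULL, guest_vcpu, NULL);

	/* time of check: read straight from "guest memory" */
	if (!(guest_intercept & FAKE_INTERCEPT_VMRUN)) {
		puts("pre-check failed");
		return 1;
	}

	usleep(20 * 1000);	/* widen the window so the demo is deterministic */

	/* time of use: a second read observes a different value */
	loaded = guest_intercept;
	printf("checked: INTERCEPT_VMRUN set, loaded: %#llx\n",
	       (unsigned long long)loaded);

	pthread_join(guest, NULL);
	return 0;
}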
---8<---
SVM first loads all the guest state and then performs consistency
checks, triggering an immediate VM exit if some fail.  KVM currently
checks some guest state and only then loads it into the parent's VMCB,
which means that the loaded and checked values can differ.

This poses a problem with nested->control.nested_ctl, where we could be
registering NPT handlers in a situation where NPT is disabled in KVM.
We also need to protect nested_svm_init_mmu_context() with npt_enabled,
as we remove the flimsy condition that was there before.

Reported-by: Jim Mattson <jmattson@xxxxxxxxxx>
Signed-off-by: Radim Krčmář <rkrcmar@xxxxxxxxxx>
---
 arch/x86/kvm/svm.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index cc6467b35a85..6db895a4d256 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3457,12 +3457,6 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 	else
 		svm->vcpu.arch.hflags &= ~HF_HIF_MASK;
 
-	if (nested_vmcb->control.nested_ctl & SVM_NESTED_CTL_NP_ENABLE) {
-		kvm_mmu_unload(&svm->vcpu);
-		svm->nested.nested_cr3 = nested_vmcb->control.nested_cr3;
-		nested_svm_init_mmu_context(&svm->vcpu);
-	}
-
 	/* Load the nested guest state */
 	svm->vmcb->save.es = nested_vmcb->save.es;
 	svm->vmcb->save.cs = nested_vmcb->save.cs;
@@ -3477,6 +3471,12 @@ static void enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 	if (npt_enabled) {
 		svm->vmcb->save.cr3 = nested_vmcb->save.cr3;
 		svm->vcpu.arch.cr3 = nested_vmcb->save.cr3;
+
+		if (nested_vmcb->control.nested_ctl & SVM_NESTED_CTL_NP_ENABLE) {
+			kvm_mmu_unload(&svm->vcpu);
+			svm->nested.nested_cr3 = nested_vmcb->control.nested_cr3;
+			nested_svm_init_mmu_context(&svm->vcpu);
+		}
 	} else
 		(void)kvm_set_cr3(&svm->vcpu, nested_vmcb->save.cr3);
 
@@ -3562,17 +3562,6 @@ static bool nested_svm_vmrun(struct vcpu_svm *svm)
 	if (!nested_vmcb)
 		return false;
 
-	if (!nested_vmcb_checks(nested_vmcb)) {
-		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
-		nested_vmcb->control.exit_code_hi = 0;
-		nested_vmcb->control.exit_info_1  = 0;
-		nested_vmcb->control.exit_info_2  = 0;
-
-		nested_svm_unmap(page);
-
-		return false;
-	}
-
 	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb_gpa,
 			       nested_vmcb->save.rip,
 			       nested_vmcb->control.int_ctl,
@@ -3688,6 +3677,9 @@ static int vmrun_interception(struct vcpu_svm *svm)
 	if (!nested_svm_vmrun(svm))
 		return 1;
 
+	if (!nested_vmcb_checks(svm))
+		goto failed;
+
 	if (!nested_svm_vmrun_msrpm(svm))
 		goto failed;
 
-- 
2.19.1
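P.S.: the diff switches the call site to nested_vmcb_checks(svm) but
does not show the reworked function itself.  One way to make the new
ordering safe is to validate copies of the control fields that the load
path snapshots, so the values checked are exactly the values used.  A
sketch, assuming enter_svm_guest_mode() also caches asid and nested_ctl
next to the existing svm->nested.intercept copy (the cached_* fields
below are hypothetical; only svm->nested.intercept exists in this
tree):

static bool nested_vmcb_checks(struct vcpu_svm *svm)
{
	/* svm->nested.intercept is the copy cached by enter_svm_guest_mode() */
	if ((svm->nested.intercept & (1ULL << INTERCEPT_VMRUN)) == 0)
		return false;

	/* hypothetical snapshot, taken at the same time as the state load */
	if (svm->nested.cached_asid == 0)
		return false;

	if ((svm->nested.cached_nested_ctl & SVM_NESTED_CTL_NP_ENABLE) &&
	    !npt_enabled)
		return false;

	return true;
}

Because the snapshots are taken together with the state that is loaded
into the VMCB, a guest write after VMRUN starts can no longer make the
checked and executed values diverge.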