On Thu, Apr 01, 2021, Maxim Levitsky wrote:
> Similar to the rest of guest page accesses after migration,
> this should be delayed to KVM_REQ_GET_NESTED_STATE_PAGES
> request.

FWIW, I still object to this approach, and this patch has a plethora of issues.

I'm not against deferring various state loading to KVM_RUN, but wholesale moving
all of GUEST_CR3 processing without in-depth consideration of all the side
effects is a really bad idea.

> Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> ---
>  arch/x86/kvm/vmx/nested.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index fd334e4aa6db..b44f1f6b68db 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2564,11 +2564,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>  		return -EINVAL;
>  	}
>
> -	/* Shadow page tables on either EPT or shadow page tables. */
> -	if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12),
> -				entry_failure_code))
> -		return -EINVAL;
> -
>  	/*
>  	 * Immediately write vmcs02.GUEST_CR3. It will be propagated to vmcs12
>  	 * on nested VM-Exit, which can occur without actually running L2 and
> @@ -3109,11 +3104,16 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
>  static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
>  {
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> +	enum vm_entry_failure_code entry_failure_code;
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  	struct kvm_host_map *map;
>  	struct page *page;
>  	u64 hpa;
>
> +	if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12),
> +				&entry_failure_code))

This results in KVM_RUN returning 0 without filling vcpu->run->exit_reason.
Speaking from experience, debugging those types of issues is beyond painful.

It also means CR3 is double loaded in the from_vmentry case.

And it will cause KVM to incorrectly return NVMX_VMENTRY_KVM_INTERNAL_ERROR if
a consistency check fails when nested_get_vmcs12_pages() is called on the
from_vmentry path.  E.g. run the unit tests with the below diff applied and the
test will silently disappear.

diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index bbb006a..b8ccc69 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -8172,6 +8172,16 @@ static void test_guest_segment_base_addr_fields(void)
 	vmcs_write(GUEST_AR_ES, ar_saved);
 }
 
+static void test_guest_cr3(void)
+{
+	u64 cr3_saved = vmcs_read(GUEST_CR3);
+
+	vmcs_write(GUEST_CR3, -1ull);
+	test_guest_state("Bad CR3 fails VM-Enter", true, -1ull, "GUEST_CR3");
+
+	vmcs_write(GUEST_CR3, cr3_saved);
+}
+
 /*
  * Check that the virtual CPU checks the VMX Guest State Area as
  * documented in the Intel SDM.
@@ -8181,6 +8191,8 @@ static void vmx_guest_state_area_test(void)
 	vmx_set_test_stage(1);
 	test_set_guest(guest_state_test_main);
 
+	test_guest_cr3();
+
 	/*
 	 * The IA32_SYSENTER_ESP field and the IA32_SYSENTER_EIP field
 	 * must each contain a canonical address.

> +		return false;
> +
>  	if (nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) {
>  		/*
>  		 * Translate L1 physical address to host physical
> @@ -3357,6 +3357,10 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  	}
>
>  	if (from_vmentry) {
> +		if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3,
> +			nested_cpu_has_ept(vmcs12), &entry_failure_code))

This alignment is messed up; it looks like two separate function calls.
> +			goto vmentry_fail_vmexit_guest_mode;
> +
>  		failed_index = nested_vmx_load_msr(vcpu,
>  						   vmcs12->vm_entry_msr_load_addr,
>  						   vmcs12->vm_entry_msr_load_count);
> --
> 2.26.2
>
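
If the CR3 load really does need to be deferred to nested_get_vmcs12_pages(),
then the failure has to be surfaced to userspace instead of letting KVM_RUN
return 0 with a stale exit_reason.  Completely untested, and only a sketch of
the idea, but something along these lines:

	if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12),
				&entry_failure_code)) {
		/*
		 * Sketch only: synthesize an internal error so userspace sees
		 * a populated exit_reason instead of a "successful" KVM_RUN.
		 */
		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
		vcpu->run->internal.ndata = 0;
		return false;
	}

That still leaves the double load and the bogus NVMX_VMENTRY_KVM_INTERNAL_ERROR
on the from_vmentry path unaddressed, which is why this needs more thought than
just moving the call.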