Il 25/09/2013 14:21, Gleb Natapov ha scritto: > On Wed, Sep 25, 2013 at 02:08:09PM +0200, Paolo Bonzini wrote: >> Il 25/09/2013 13:51, Gleb Natapov ha scritto: >>> On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote: >>>> Il 25/09/2013 11:51, Gleb Natapov ha scritto: >>>>> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) >>>>> kvm_set_cr3(vcpu, vmcs12->guest_cr3); >>>>> kvm_mmu_reset_context(vcpu); >>>>> >>>>> + if (!enable_ept) >>>>> + vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested; >>>>> + >>>>> /* >>>>> * L1 may access the L2's PDPTR, so save them to construct vmcs12 >>>>> */ >>>>> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, >>>>> kvm_set_cr3(vcpu, vmcs12->host_cr3); >>>>> kvm_mmu_reset_context(vcpu); >>>>> >>>>> + if (!enable_ept) >>>>> + vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault; >>>> >>>> This is strictly speaking not needed, because kvm_mmu_reset_context >>>> takes care of it. >>>> >>> Yeah, but better make it explicit, it does not hurt but make it more >>> clear what is going on. Or at least add comment above >>> kvm_mmu_reset_context() about this side effect. >> >> Yes, I agree the code is cleaner like you wrote it. >> >>>> But I wonder if it is cleaner to not touch the struct here, and instead >>>> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like >>>> kvm_x86_ops->set_cr3. The new member can do something like >>>> >>>> if (is_guest_mode(vcpu)) { >>>> struct vmcs12 *vmcs12 = get_vmcs12(vcpu); >>>> if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) { >>>> nested_vmx_vmexit(vcpu); >>>> return; >>>> } >>>> } >>>> >>>> kvm_inject_page_fault(vcpu, fault); >>> >>> I do not quite understand what you mean here. inject_page_fault() is >>> called from the depth of page table walking. How the code will not to >>> call new member in some circumstances? >> >> IIUC the new function is called if and only if is_guest_mode(vcpu) && >> !enable_ept. So what I'm suggesting is something like this: >> > Ah I see, so you propose to check for guest mode and enable_ept in the > function instead of switching to another function, but switching to > another function is how code was designed to be. You do not need to check enable_ept if I understand the code correctly, because the new function is specifically called in init_kvm_softmmu, i.e. not for nested_mmu and not for tdp_enabled. I'm asking because I didn't find any other place that modifies function pointers this way after kvm_mmu_reset_context. > Nested NPT/EPT provide > their own function too, but there is nothing that stops you from > checking on what MMU you are now in the function itself. The difference is that NPT/EPT use a completely different paging mode for nested and non-nested (non-nested uses direct mapping, nested uses shadow mapping). Shadow paging is really the same thing for nested and non-nested, you just have to do the injection the right way. >> --- a/arch/x86/include/asm/kvm_host.h >> +++ b/arch/x86/include/asm/kvm_host.h >> @@ -735,6 +735,8 @@ struct kvm_x86_ops { >> void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool host); >> >> void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3); >> + void (*inject_softmmu_page_fault)(struct kvm_vcpu *vcpu, >> + struct x86_exception *fault); >> >> void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry); >> >> --- a/arch/x86/kvm/mmu.c >> +++ b/arch/x86/kvm/mmu.c >> @@ -3805,7 +3805,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu) >> vcpu->arch.walk_mmu->set_cr3 = kvm_x86_ops->set_cr3; >> vcpu->arch.walk_mmu->get_cr3 = get_cr3; >> vcpu->arch.walk_mmu->get_pdptr = kvm_pdptr_read; >> - vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault; >> + vcpu->arch.walk_mmu->inject_page_fault = kvm_x86_ops->inject_softmmu_page_fault; >> >> return r; >> } >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -7499,6 +7499,20 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, >> vmcs12->guest_physical_address = fault->address; >> } >> >> +static void vmx_inject_softmmu_page_fault(struct kvm_vcpu *vcpu, >> + struct x86_exception *fault) >> +{ >> + if (is_guest_mode(vcpu)) { > is_guest_mode(vcpu) && !enable_ept You don't really need to check for enable_ept (perhaps WARN_ON(enable_ept) instead) because the function is not used always, only in init_kvm_softmmu. > I described what I saw with VMX, I am not saying the same happens with > SVM :) I just do not see why it should not and the non fatality of the > BUG can explain why it was missed. Interesting, I got something completely different. The guest just got stuck before even getting to the GRUB prompt. I'm trying your patches now... Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html