Re: [PATCH 4/4] KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2

Gleb Natapov <gleb@xxxxxxxxxx> · Wed, 25 Sep 2013 16:36:42 +0300



On Wed, Sep 25, 2013 at 03:26:56PM +0200, Paolo Bonzini wrote:
> Il 25/09/2013 14:21, Gleb Natapov ha scritto:
> > On Wed, Sep 25, 2013 at 02:08:09PM +0200, Paolo Bonzini wrote:
> >> Il 25/09/2013 13:51, Gleb Natapov ha scritto:
> >>> On Wed, Sep 25, 2013 at 01:24:49PM +0200, Paolo Bonzini wrote:
> >>>> Il 25/09/2013 11:51, Gleb Natapov ha scritto:
> >>>>> @@ -7773,6 +7787,9 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> >>>>>  	kvm_set_cr3(vcpu, vmcs12->guest_cr3);
> >>>>>  	kvm_mmu_reset_context(vcpu);
> >>>>>  
> >>>>> +	if (!enable_ept)
> >>>>> +		vcpu->arch.walk_mmu->inject_page_fault = vmx_inject_page_fault_nested;
> >>>>> +
> >>>>>  	/*
> >>>>>  	 * L1 may access the L2's PDPTR, so save them to construct vmcs12
> >>>>>  	 */
> >>>>> @@ -8232,6 +8249,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
> >>>>>  	kvm_set_cr3(vcpu, vmcs12->host_cr3);
> >>>>>  	kvm_mmu_reset_context(vcpu);
> >>>>>  
> >>>>> +	if (!enable_ept)
> >>>>> +		vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> >>>>
> >>>> This is strictly speaking not needed, because kvm_mmu_reset_context
> >>>> takes care of it.
> >>>>
> >>> Yeah, but better make it explicit, it does not hurt but make it more
> >>> clear what is going on. Or at least add comment above
> >>> kvm_mmu_reset_context() about this side effect.
> >>
> >> Yes, I agree the code is cleaner like you wrote it.
> >>
> >>>> But I wonder if it is cleaner to not touch the struct here, and instead
> >>>> add a new member to kvm_x86_ops---used directly in init_kvm_softmmu like
> >>>> kvm_x86_ops->set_cr3.  The new member can do something like
> >>>>
> >>>> 	if (is_guest_mode(vcpu)) {
> >>>> 		struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> >>>> 		if (vmcs12->exception_bitmap & (1u << PF_VECTOR)) {
> >>>> 			nested_vmx_vmexit(vcpu);
> >>>> 			return;
> >>>> 		}
> >>>> 	}
> >>>>
> >>>> 	kvm_inject_page_fault(vcpu, fault);
> >>>
> >>> I do not quite understand what you mean here. inject_page_fault() is
> >>> called from the depth of page table walking. How the code will not to
> >>> call new member in some circumstances?
> >>
> >> IIUC the new function is called if and only if is_guest_mode(vcpu) && 
> >> !enable_ept.  So what I'm suggesting is something like this:
> >>
> > Ah I see, so you propose to check for guest mode and enable_ept in the
> > function instead of switching to another function, but switching to
> > another function is how code was designed to be.
> 
> You do not need to check enable_ept if I understand the code correctly,
> because the new function is specifically called in init_kvm_softmmu,
> i.e. not for nested_mmu and not for tdp_enabled.
> 
Ah true. With EPT shadow page code will not run so function will not be
called.

> I'm asking because I didn't find any other place that modifies function
> pointers this way after kvm_mmu_reset_context.
> 
nested_ept_init_mmu_context() modify the pointer. Not after
kvm_mmu_reset_context() but it does not matter much. The canonical way
to do what I did here would be to create special mmu context to handle
shadow on shadow, bit the only difference between it and regular shadow
one would be this pointer, so it looked like overkill. Just changing the
pointer do the trick.

> > Nested NPT/EPT provide
> > their own function too, but there is nothing that stops you from
> > checking on what MMU you are now in the function itself.
> 
> The difference is that NPT/EPT use a completely different paging mode
> for nested and non-nested (non-nested uses direct mapping, nested uses
> shadow mapping).  Shadow paging is really the same thing for nested and
> non-nested, you just have to do the injection the right way.
> 
> >> --- a/arch/x86/include/asm/kvm_host.h
> >> +++ b/arch/x86/include/asm/kvm_host.h
> >> @@ -735,6 +735,8 @@ struct kvm_x86_ops {
> >>  	void (*adjust_tsc_offset)(struct kvm_vcpu *vcpu, s64 adjustment, bool host);
> >>  
> >>  	void (*set_tdp_cr3)(struct kvm_vcpu *vcpu, unsigned long cr3);
> >> +	void (*inject_softmmu_page_fault)(struct kvm_vcpu *vcpu,
> >> +					  struct x86_exception *fault);
> >>  
> >>  	void (*set_supported_cpuid)(u32 func, struct kvm_cpuid_entry2 *entry);
> >>  
> >> --- a/arch/x86/kvm/mmu.c
> >> +++ b/arch/x86/kvm/mmu.c
> >> @@ -3805,7 +3805,7 @@ static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
> >>  	vcpu->arch.walk_mmu->set_cr3           = kvm_x86_ops->set_cr3;
> >>  	vcpu->arch.walk_mmu->get_cr3           = get_cr3;
> >>  	vcpu->arch.walk_mmu->get_pdptr         = kvm_pdptr_read;
> >> -	vcpu->arch.walk_mmu->inject_page_fault = kvm_inject_page_fault;
> >> +	vcpu->arch.walk_mmu->inject_page_fault = kvm_x86_ops->inject_softmmu_page_fault;
> >>  
> >>  	return r;
> >>  }
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -7499,6 +7499,20 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
> >>  	vmcs12->guest_physical_address = fault->address;
> >>  }
> >>  
> >> +static void vmx_inject_softmmu_page_fault(struct kvm_vcpu *vcpu,
> >> +		struct x86_exception *fault)
> >> +{
> >> +	if (is_guest_mode(vcpu)) {
> > is_guest_mode(vcpu) && !enable_ept
> 
> You don't really need to check for enable_ept (perhaps
> WARN_ON(enable_ept) instead) because the function is not used always,
> only in init_kvm_softmmu.
> 
> > I described what I saw with VMX, I am not saying the same happens with
> > SVM :) I just do not see why it should not and the non fatality of the
> > BUG can explain why it was missed.
> 
> Interesting, I got something completely different.  The guest just got
> stuck before even getting to the GRUB prompt.  I'm trying your patches
> now...
> 
That's what patch 2 suppose to fix.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html