Re: [PATCH v2 0/5] Add support for EPT execute only for nested hypervisors

Bandan Das <bsd@xxxxxxxxxx> · Wed, 13 Jul 2016 11:47:22 -0400

Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:

> On 13/07/2016 17:06, Bandan Das wrote:
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index 190c0559c221..bd2535fdb9eb 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -2524,11 +2524,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>>  		return 0;
>>>  
>>>  	/*
>>> -	 * In the non-EPT case, execonly is not valid and so
>>> -	 * the following line is equivalent to spte |= PT_PRESENT_MASK.
>>>  	 * For the EPT case, shadow_present_mask is 0 if hardware
>>> -	 * supports it and we honor whatever way the guest set it.
>>> -	 * See: FNAME(gpte_access) in paging_tmpl.h
>>> +	 * supports exec-only page table entries.  In that case,
>>> +	 * ACC_USER_MASK and shadow_user_mask are used to represent
>>> +	 * read access.  See FNAME(gpte_access) in paging_tmpl.h.
>>>  	 */
>> 
>> I would still prefer a note about the non-EPT case, makes it easy to
>> understand.
>
> I can add "shadow_present_mask is PT_PRESENT_MASK in the non-EPT case"
> but it's a bit of a tautology.

shadow_present_mask actually signifies different things in the EPT and
non-EPT cases, so it doesn't hurt to mention it. But I get your point;
maybe it's self-explanatory.
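
To spell out the two meanings, here is a minimal userspace sketch of how
the mask ends up being chosen. It is based only on the hardware_setup()
hunk in this thread plus the non-EPT default of PT_PRESENT_MASK.
pick_shadow_present_mask() is a hypothetical helper, not a kernel
function, and the bit values are restated here as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values restated as assumptions; both masks are bit 0. */
#define PT_PRESENT_MASK        (1ULL << 0)
#define VMX_EPT_READABLE_MASK  (1ULL << 0)

/*
 * Hypothetical helper: returns the bits that set_spte() would OR into
 * every SPTE through "spte |= shadow_present_mask".
 */
static inline uint64_t pick_shadow_present_mask(bool ept_enabled,
						bool ept_execonly)
{
	if (!ept_enabled)
		return PT_PRESENT_MASK;		/* bit 0 means "present" */
	if (ept_execonly)
		return 0;			/* nothing forced on; guest decides */
	return VMX_EPT_READABLE_MASK;		/* bit 0 means "readable" */
}
```

So with exec-only support the unconditional OR is a no-op, and
readability has to come from the access bits instead.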

>>>  	spte |= shadow_present_mask;
>>>  	if (!speculative)
>>> @@ -3923,9 +3922,6 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
>>>  				 *   clearer.
>>>  				 */
>>>  				smap = cr4_smap && u && !uf && !ff;
>>> -			} else {
>>> -				if (shadow_present_mask)
>>> -					u = 1;
>>>  			}
>>>  
>>>  			fault = (ff && !x) || (uf && !u) || (wf && !w) ||
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 576c47cda1a3..dfef081e76c0 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -6120,12 +6120,14 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>>>  	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
>>>  	trace_kvm_page_fault(gpa, exit_qualification);
>>>  
>>> -	/* It is a write fault? */
>>> +	/* it is a read fault? */
>>> +	error_code = (exit_qualification << 2) & PFERR_USER_MASK;
>>> +	/* it is a write fault? */
>>> -	error_code = exit_qualification & PFERR_WRITE_MASK;
>>> +	error_code |= exit_qualification & PFERR_WRITE_MASK;
>>>  	/* It is a fetch fault? */
>>>  	error_code |= (exit_qualification << 2) & PFERR_FETCH_MASK;
>>>  	/* ept page table is present? */
>>> -	error_code |= (exit_qualification >> 3) & PFERR_PRESENT_MASK;
>>> +	error_code |= (exit_qualification & 0x38) != 0;
>>>
>> 
>> Thank you for the thorough review here. I missed that we didn't set the read bit
>> at all. I am still a little unclear how permission_fault works though...
>> 
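
For reference, a standalone sketch of how the exit qualification is
folded into a page-fault-style error code, and how the three permission
terms visible in the update_permission_bitmask() hunk would then treat
it. In the exit qualification, bits 0-2 report read/write/fetch and
bits 3-5 report whether the address was readable/writable/executable.
Both function names are hypothetical, the mask values are restated as
assumptions, and the SMEP/SMAP terms are left out, so this is not the
kernel's permission_fault().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Page-fault error code bits (assumed values). */
#define PFERR_PRESENT_MASK (1U << 0)
#define PFERR_WRITE_MASK   (1U << 1)
#define PFERR_USER_MASK    (1U << 2)
#define PFERR_FETCH_MASK   (1U << 4)

/* Access bits (assumed values); under EPT, "user" stands in for "read". */
#define ACC_EXEC_MASK  1U
#define ACC_WRITE_MASK 2U
#define ACC_USER_MASK  4U

/* Hypothetical helper mirroring the decoding in handle_ept_violation(). */
static inline uint32_t ept_qual_to_error_code(uint64_t exit_qualification)
{
	uint32_t error_code;

	/* read fault: qualification bit 0 -> PFERR_USER_MASK (bit 2) */
	error_code = (exit_qualification << 2) & PFERR_USER_MASK;
	/* write fault: bit 1 in both encodings; OR it in, do not overwrite */
	error_code |= exit_qualification & PFERR_WRITE_MASK;
	/* fetch fault: qualification bit 2 -> PFERR_FETCH_MASK (bit 4) */
	error_code |= (exit_qualification << 2) & PFERR_FETCH_MASK;
	/* present if any of the R/W/X permission bits (3-5) was set */
	error_code |= (exit_qualification & 0x38) != 0;

	return error_code;
}

/* Hypothetical, simplified check: only the three terms from the hunk. */
static inline bool ept_access_faults(uint32_t error_code, unsigned int access)
{
	bool ff = error_code & PFERR_FETCH_MASK;
	bool uf = error_code & PFERR_USER_MASK;
	bool wf = error_code & PFERR_WRITE_MASK;
	bool x = access & ACC_EXEC_MASK;
	bool u = access & ACC_USER_MASK;
	bool w = access & ACC_WRITE_MASK;

	return (ff && !x) || (uf && !u) || (wf && !w);
}
```

With an exec-only mapping (access == ACC_EXEC_MASK), a read sets uf
while u is clear, so it faults; a fetch from the same page does not.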
>>>  	vcpu->arch.exit_qualification = exit_qualification;
>>>  
>>> @@ -6474,8 +6476,7 @@ static __init int hardware_setup(void)
>>>  			(enable_ept_ad_bits) ? VMX_EPT_DIRTY_BIT : 0ull,
>>>  			0ull, VMX_EPT_EXECUTABLE_MASK,
>>>  			cpu_has_vmx_ept_execute_only() ?
>>> -				      0ull : PT_PRESENT_MASK);
>>> -		BUILD_BUG_ON(PT_PRESENT_MASK != VMX_EPT_READABLE_MASK);
>>> +				      0ull : VMX_EPT_READABLE_MASK);
>> 
>> I wanted to keep it the former way because "PT_PRESENT_MASK is equal to VMX_EPT_READABLE_MASK"
>> is an assumption all throughout. I wanted to use this section to catch mismatches.
>
> I think there's no such assumption anymore, actually.  Can you double
> check?  If there are any, that's where the BUILD_BUG_ON should be.

What I meant is the places that rely on them being the same bit.
is_shadow_present_pte() is probably one, and another is link_shadow_page(),
which already has a BUILD_BUG_ON().
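
As a rough illustration of keeping the check next to the code that
depends on it, here is a userspace sketch that uses static_assert as a
stand-in for BUILD_BUG_ON(). pte_bit0_set() is a hypothetical example
of such code, not an existing kernel function, and the mask values are
restated as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: both masks are bit 0, the executable bit is bit 2. */
#define PT_PRESENT_MASK          (1ULL << 0)
#define VMX_EPT_READABLE_MASK    (1ULL << 0)
#define VMX_EPT_EXECUTABLE_MASK  (1ULL << 2)

/* Compilation fails here if the two masks ever stop being the same bit. */
static_assert(PT_PRESENT_MASK == VMX_EPT_READABLE_MASK,
	      "code below treats 'present' and 'readable' as one bit");

/* Hypothetical helper that silently relies on the assertion above. */
static inline int pte_bit0_set(uint64_t pte)
{
	return (pte & PT_PRESENT_MASK) != 0;
}
```

Note that an exec-only EPT entry has only VMX_EPT_EXECUTABLE_MASK set,
so a test on bit 0 alone would report it as not present.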

> Paolo