Re: [PATCH v5 3/5] KVM: x86: Virtualize CR3.LAM_{U48,U57}

Chao Gao <chao.gao@xxxxxxxxx> · Fri, 3 Mar 2023 23:53:18 +0800

On Fri, Mar 03, 2023 at 10:23:50PM +0800, Robert Hoo wrote:
>On Fri, 2023-03-03 at 14:21 +0800, Chao Gao wrote:
>> On Mon, Feb 27, 2023 at 04:45:45PM +0800, Robert Hoo wrote:
>> > LAM feature uses 2 high bits in CR3 (bit 62 for LAM_U48 and bit 61
>> > for
>> > LAM_U57) to enable/config LAM feature for user mode addresses. The
>> > LAM
>> > masking is done before legacy canonical checks.
>> > 
>> > To virtualize LAM CR3 bits usage, this patch:
>> > 1. Don't reserve these 2 bits when LAM is enable on the vCPU.
>> > Previously
>> > when validate CR3, kvm uses kvm_vcpu_is_legal_gpa(), now define
>> > kvm_vcpu_is_valid_cr3() which is actually kvm_vcpu_is_legal_gpa()
>> > + CR3.LAM bits validation. Substitutes
>> > kvm_vcpu_is_legal/illegal_gpa()
>> > with kvm_vcpu_is_valid_cr3() in call sites where is validating CR3
>> > rather
>> > than pure GPA.
>> > 2. mmu::get_guest_pgd(), its implementation is get_cr3() which
>> > returns
>> > whole guest CR3 value. Strip LAM bits in those call sites that need
>> > pure
>> > PGD value, e.g. mmu_alloc_shadow_roots(),
>> > FNAME(walk_addr_generic)().
>> > 3. When form a new guest CR3 (vmx_load_mmu_pgd()), melt in LAM bit
>> > (kvm_get_active_lam()).
>> > 4. When guest sets CR3, identify ONLY-LAM-bits-toggling cases,
>> > where it is
>> > unnecessary to make new pgd, but just make request of load pgd,
>> > then new
>> > CR3.LAM bits configuration will be melt in (above point 3). To be
>> > conservative, this case still do TLB flush.
>> > 5. For nested VM entry, allow the 2 CR3 bits set in corresponding
>> > VMCS host/guest fields.
>> 
>> isn't this already covered by item #1 above?
>
>Ah, it is to address your comments on last version. To repeat/emphasize
>again, doesn't harm, does it?;) 

It is confusing. Trying to merge #5 to #1:

If LAM is supported, bits 62:61 (LAM_U48 and LAM_U57) are not reserved
in CR3. VM entry also allows the two bits to be set in CR3 field in
guest-state and host-state area of the VMCS. Previously ...

>> 
>(...)
>> > 
>> > +static inline u64 kvm_get_active_lam(struct kvm_vcpu *vcpu)
>> > +{
>> > +	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 |
>> > X86_CR3_LAM_U57);
>> > +}
>> 
>> I think it is better to define a mask (like reserved_gpa_bits):
>> 
>> kvm_vcpu_arch {
>> 	...
>> 
>> 	/*
>> 	 * Bits in CR3 used to enable certain features. These bits
>> don't
>> 	 * participate in page table walking. They should be masked to
>> 	 * get the base address of page table. When shadow paging is
>> 	 * used, these bits should be kept as is in the shadow CR3.
>> 	 */
>> 	u64 cr3_control_bits;
>> 
>
>I don't strongly object this. But per SDM, CR3.bit[63:MAXPHYADDR] are
>reserved; and MAXPHYADDR is at most 52 [1]. So can we assert and simply
>define the MASK bit[63:52]? (I did this in v3 and prior)

No. Setting any bit in 60:52 should be rejected. And setting bit 62 or
61 should be allowed if LAM is supported by the vCPU. I don't see how
your proposal can distinguish these two cases.