On Fri, 2023-03-03 at 14:21 +0800, Chao Gao wrote:
> On Mon, Feb 27, 2023 at 04:45:45PM +0800, Robert Hoo wrote:
> > The LAM feature uses 2 high bits in CR3 (bit 62 for LAM_U48 and bit
> > 61 for LAM_U57) to enable/configure LAM for user-mode addresses. The
> > LAM masking is done before the legacy canonical checks.
> >
> > To virtualize LAM's CR3 bits usage, this patch:
> > 1. Doesn't reserve these 2 bits when LAM is enabled on the vCPU.
> > Previously, when validating CR3, KVM used kvm_vcpu_is_legal_gpa();
> > now define kvm_vcpu_is_valid_cr3(), which is kvm_vcpu_is_legal_gpa()
> > plus CR3.LAM bits validation. Substitute
> > kvm_vcpu_is_legal/illegal_gpa() with kvm_vcpu_is_valid_cr3() in call
> > sites that validate CR3 rather than a pure GPA.
> > 2. mmu::get_guest_pgd() is implemented by get_cr3(), which returns
> > the whole guest CR3 value. Strip the LAM bits in those call sites
> > that need a pure PGD value, e.g. mmu_alloc_shadow_roots(),
> > FNAME(walk_addr_generic)().
> > 3. When forming a new guest CR3 (vmx_load_mmu_pgd()), merge in the
> > LAM bits (kvm_get_active_lam()).
> > 4. When the guest sets CR3, identify the ONLY-LAM-bits-toggling
> > cases, where it is unnecessary to make a new pgd; just request a pgd
> > load, and the new CR3.LAM bits configuration will be merged in
> > (point 3 above). To be conservative, this case still does a TLB
> > flush.
> > 5. For nested VM entry, allow the 2 CR3 bits to be set in the
> > corresponding VMCS host/guest fields.
>
> isn't this already covered by item #1 above?

Ah, it is there to address your comments on the last version. To
repeat/emphasize again doesn't harm, does it? ;)

> (...)
> >
> > +static inline u64 kvm_get_active_lam(struct kvm_vcpu *vcpu)
> > +{
> > +	return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> > +}
>
> I think it is better to define a mask (like reserved_gpa_bits):
>
> kvm_vcpu_arch {
> 	...
>
> 	/*
> 	 * Bits in CR3 used to enable certain features. These bits don't
> 	 * participate in page table walking. They should be masked to
> 	 * get the base address of the page table. When shadow paging is
> 	 * used, these bits should be kept as-is in the shadow CR3.
> 	 */
> 	u64 cr3_control_bits;
>
I don't strongly object to this. But per the SDM, CR3.bit[63:MAXPHYADDR]
are reserved, and MAXPHYADDR is at most 52 [1]. So can we assert that and
simply define the mask as bit[63:52]? (I did this in v3 and prior.)

[1] CPUID.80000008H:EAX[7:0] reports the physical-address width supported
by the processor. (For processors that do not support CPUID function
80000008H, the width is generally 36 if CPUID.01H:EDX.PAE [bit 6] = 1 and
32 otherwise.) This width is referred to as MAXPHYADDR. MAXPHYADDR is at
most 52. (SDM 4.1.4 Enumeration of Paging Features by CPUID)

> and initialize the mask in kvm_vcpu_after_set_cpuid():
>
> 	if (guest_cpuid_has(X86_FEATURE_LAM))
> 		vcpu->arch.cr3_control_bits = X86_CR3_LAM_U48 |
> 					      X86_CR3_LAM_U57;
>
> then add helpers to extract/mask control bits.
>
> It is cleaner and can avoid looking up guest CPUID at runtime. And if
> AMD has a similar feature (e.g., some CR3 bits are used as control
> bits), it is easy to support that feature.
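
[Editor's note] For illustration only, a minimal standalone sketch of the
mask-plus-helpers approach suggested above. This is not the actual KVM
code: the stub struct kvm_vcpu and kvm_read_cr3() stand in for the real
ones, and kvm_get_active_cr3_control_bits()/kvm_get_cr3_pgd() are
hypothetical helper names. Only the X86_CR3_LAM_U48/U57 bit positions
(62/61) come from the discussion.

	#include <stdint.h>
	#include <stdio.h>

	#define X86_CR3_LAM_U48 (1ULL << 62)	/* LAM for 48-bit user addresses */
	#define X86_CR3_LAM_U57 (1ULL << 61)	/* LAM for 57-bit user addresses */

	/* Stub standing in for the real KVM vCPU structures. */
	struct kvm_vcpu {
		uint64_t cr3;
		/*
		 * Per Chao's suggestion: set once in
		 * kvm_vcpu_after_set_cpuid() based on guest CPUID, so no
		 * CPUID lookup is needed at runtime.
		 */
		uint64_t cr3_control_bits;
	};

	static inline uint64_t kvm_read_cr3(struct kvm_vcpu *vcpu)
	{
		return vcpu->cr3;
	}

	/* Control (non-address) bits currently active in guest CR3. */
	static inline uint64_t kvm_get_active_cr3_control_bits(struct kvm_vcpu *vcpu)
	{
		return kvm_read_cr3(vcpu) & vcpu->cr3_control_bits;
	}

	/* CR3 with control bits stripped: the base used for page-table walks. */
	static inline uint64_t kvm_get_cr3_pgd(struct kvm_vcpu *vcpu)
	{
		return kvm_read_cr3(vcpu) & ~vcpu->cr3_control_bits;
	}

	int main(void)
	{
		struct kvm_vcpu vcpu = {
			.cr3 = 0x12345000ULL | X86_CR3_LAM_U57,
			.cr3_control_bits = X86_CR3_LAM_U48 | X86_CR3_LAM_U57,
		};

		printf("active control bits: 0x%llx\n",
		       (unsigned long long)kvm_get_active_cr3_control_bits(&vcpu));
		printf("page-table base:     0x%llx\n",
		       (unsigned long long)kvm_get_cr3_pgd(&vcpu));
		return 0;
	}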
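
[Editor's note] And, for comparison, a sketch of the constant-mask
alternative Robert refers to (the v3-and-prior approach): because CR3
bits 63:MAXPHYADDR are reserved and MAXPHYADDR is at most 52 (SDM 4.1.4),
bits 63:52 can never be address bits, so a compile-time mask suffices.
CR3_HI_CONTROL_MASK and cr3_strip_high_bits() are hypothetical names.

	#include <stdint.h>

	/* Bits 63:52 of CR3: never part of the page-table base address. */
	#define CR3_HI_CONTROL_MASK	(~((1ULL << 52) - 1))

	/*
	 * Unconditionally strip the high non-address bits; no per-vCPU
	 * state or guest CPUID lookup is required with this variant.
	 */
	static inline uint64_t cr3_strip_high_bits(uint64_t cr3)
	{
		return cr3 & ~CR3_HI_CONTROL_MASK;
	}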