On Wed, Oct 21, 2020 at 04:39:28PM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@xxxxxxxxx> writes:
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index e0fea09a6e42..89019e6476b3 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -478,18 +478,13 @@ static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush
> >  					   range->pages);
> >  }
> >  
> > -static inline int hv_remote_flush_eptp(u64 eptp, struct kvm_tlb_range *range)
> > +static inline int hv_remote_flush_pgd(u64 pgd, struct kvm_tlb_range *range)
> >  {
> > -	/*
> > -	 * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
> > -	 * of the base of EPT PML4 table, strip off EPT configuration
> > -	 * information.
> > -	 */
> >  	if (range)
> > -		return hyperv_flush_guest_mapping_range(eptp & PAGE_MASK,
> > +		return hyperv_flush_guest_mapping_range(pgd,
> >  				kvm_fill_hv_flush_list_func, (void *)range);
> >  	else
> > -		return hyperv_flush_guest_mapping(eptp & PAGE_MASK);
> > +		return hyperv_flush_guest_mapping(pgd);
> >  }
>
> (I'm probably missing something, please bear with me -- this is the last
> patch of the series after all :-) but PGD which comes from
> kvm_mmu_load_pgd() has PCID bits encoded and you're dropping
> '& PAGE_MASK' here ...
> > @@ -564,17 +559,17 @@ static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
> >  
> >  #endif /* IS_ENABLED(CONFIG_HYPERV) */
> >  
> > -static void hv_load_mmu_eptp(struct kvm_vcpu *vcpu, u64 eptp)
> > +static void hv_load_mmu_pgd(struct kvm_vcpu *vcpu, u64 pgd)
> >  {
> >  #if IS_ENABLED(CONFIG_HYPERV)
> >  	struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm);
> >  
> >  	if (kvm_x86_ops.tlb_remote_flush == hv_remote_flush_tlb) {
> > -		spin_lock(&kvm_vmx->ept_pointer_lock);
> > -		to_vmx(vcpu)->ept_pointer = eptp;
> > -		if (eptp != kvm_vmx->hv_tlb_eptp)
> > -			kvm_vmx->hv_tlb_eptp = INVALID_PAGE;
> > -		spin_unlock(&kvm_vmx->ept_pointer_lock);
> > +		spin_lock(&kvm_vmx->hv_tlb_pgd_lock);
> > +		to_vmx(vcpu)->hv_tlb_pgd = pgd;
> > +		if (pgd != kvm_vmx->hv_tlb_pgd)
> > +			kvm_vmx->hv_tlb_pgd = INVALID_PAGE;
> > +		spin_unlock(&kvm_vmx->hv_tlb_pgd_lock);
> >  	}
> >  #endif
> >  }
> > @@ -3059,7 +3054,7 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd,
> >  	eptp = construct_eptp(vcpu, pgd, pgd_level);
> >  	vmcs_write64(EPT_POINTER, eptp);
> >  
> > -	hv_load_mmu_eptp(vcpu, eptp);
> > +	hv_load_mmu_pgd(vcpu, pgd);
>
> ... and not adding it here. (construct_eptp() seems to drop PCID bits
> but add its own stuff). Is this on purpose?

No, I completely forgot KVM crams the PCID bits into pgd.  I think I'll add
a patch to rework .load_mmu_pgd() to move the PCID bits to a separate
param, and change construct_eptp() to do WARN_ON_ONCE(pgd & ~PAGE_MASK).

Actually, I think it makes more sense to have VMX and SVM grab the PCID via
kvm_get_active_pcid(vcpu) when necessary.  For EPTP, getting the PCID bits
may unnecessarily read CR3 from the VMCS.

Ugh, which brings up another issue.
I'm pretty sure the "vmcs01.GUEST_CR3 is already up-to-date" case is dead
code:

	if (!enable_unrestricted_guest && !is_paging(vcpu))
		guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
	else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
		guest_cr3 = vcpu->arch.cr3;
	else /* vmcs01.GUEST_CR3 is already up-to-date. */
		update_guest_cr3 = false;

	vmx_ept_load_pdptrs(vcpu);

The sole caller of .load_mmu_pgd() always invokes kvm_get_active_pcid(),
which in turn always does kvm_read_cr3(), i.e. CR3 will always be
available.

So yeah, I think moving kvm_get_active_pcid() into VMX/SVM is the right
approach.  I'll rename "pgd" to "root_hpa" and "pgd_level" to "root_level"
so that we don't end up with inconsistencies, e.g. where pgd may or may not
contain PCID bits.

Nice catch!