On Thu, Jul 19, 2018 at 12:44:53PM -0700, Jim Mattson wrote:
> If we're using nested EPT, why not do away with this allocation
> altogether, and just use the vpid from vmcs12? The TLB is tagged by
> {PCID, EP4TA, VPID}, and the shadow EP4TA will be different from any
> L1 EP4TA.

Hmm, I'm wondering why the original commit brought any improvement then?
The commit message says

> VPID is used to tag address space and avoid a TLB flush. Currently L0 use
> the same VPID to run L1 and all its guests. KVM flushes VPID when switching
> between L1 and L2.
>
> This patch advertises VPID to the L1 hypervisor, then address space of L1
> and L2 can be separately treated and avoid TLB flush when swithing between
> L1 and L2. For each nested vmentry, if vpid12 is changed, reuse shadow vpid
> w/ an invvpid.

but my understanding of your comment is that even if L1 and L2 share the
vpid, flushing one doesn't flush the other, so all of the performance
gain of that commit comes from skipping the flush when entering the same
L2 vcpu and/or advertising vpid support in L1 hypervisor. Is this correct?

Thanks,
Roman.

> On Thu, Jul 19, 2018 at 11:59 AM, Roman Kagan <rkagan@xxxxxxxxxxxxx> wrote:
> > VPID for the nested vcpu is allocated at vmx_create_vcpu whenever nested
> > vmx is turned on with the module parameter.
> >
> > However, it's only freed if the L1 guest has executed VMXON which is not
> > a given.
> >
> > As a result, on a system with nested==on every creation+deletion of an
> > L1 vcpu without running an L2 guest results in leaking one vpid. Since
> > the total number of vpids is limited to 64k, they can eventually get
> > exhausted, preventing L2 from starting.
> >
> > Delay allocation of the L2 vpid until VMXON emulation, thus matching its
> > freeing.
> >
> > Fixes: 5c614b3583e7b6dab0c86356fa36c2bcbb8322a0
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Roman Kagan <rkagan@xxxxxxxxxxxxx>
> > ---
> >  arch/x86/kvm/vmx.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index e30da9a2430c..fcb33ba15bb9 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -7893,6 +7893,8 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu)
> >                       HRTIMER_MODE_REL_PINNED);
> >          vmx->nested.preemption_timer.function = vmx_preemption_timer_fn;
> >
> > +        vmx->nested.vpid02 = allocate_vpid();
> > +
> >          vmx->nested.vmxon = true;
> >          return 0;
> >
> > @@ -10370,11 +10372,9 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
> >                  goto free_vmcs;
> >          }
> >
> > -        if (nested) {
> > +        if (nested)
> >                  nested_vmx_setup_ctls_msrs(&vmx->nested.msrs,
> >                                             kvm_vcpu_apicv_active(&vmx->vcpu));
> > -                vmx->nested.vpid02 = allocate_vpid();
> > -        }
> >
> >          vmx->nested.posted_intr_nv = -1;
> >          vmx->nested.current_vmptr = -1ull;
> > @@ -10391,7 +10391,6 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
> >          return &vmx->vcpu;
> >
> >  free_vmcs:
> > -        free_vpid(vmx->nested.vpid02);
> >          free_loaded_vmcs(vmx->loaded_vmcs);
> >  free_msrs:
> >          kfree(vmx->guest_msrs);
> > --
> > 2.17.1
> >
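
For anyone following the thread, the leak described in the quoted patch
can be modelled in a few lines of userspace C. This is only a rough
sketch and not kernel code: the 8-entry pool, struct toy_vcpu, the
toy_*() helpers, vpids_in_use() and leaked_after() are all made up for
illustration, and allocate_vpid()/free_vpid() here are simplified
stand-ins that merely borrow the kernel names. It assumes, as the patch
description says, that vpid02 is released only once VMXON has been
executed.

```c
/* Toy userspace model of the vpid02 lifecycle; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

#define NR_VPIDS 8	/* tiny pool for illustration; the real limit is 64k */

/* Assumption of this model: vpid 0 is never handed out. */
static bool vpid_used[NR_VPIDS] = { [0] = true };

/* Simplified stand-in: returns a free vpid, or 0 if the pool is exhausted. */
static int allocate_vpid(void)
{
	for (int i = 1; i < NR_VPIDS; i++) {
		if (!vpid_used[i]) {
			vpid_used[i] = true;
			return i;
		}
	}
	return 0;
}

/* Simplified stand-in: releasing vpid 0 is a no-op. */
static void free_vpid(int vpid)
{
	assert(vpid >= 0 && vpid < NR_VPIDS);
	if (vpid != 0)
		vpid_used[vpid] = false;
}

/* Hypothetical, heavily reduced stand-in for the nested vcpu state. */
struct toy_vcpu {
	bool vmxon;
	int vpid02;
};

/* alloc_at_create == true models the old behaviour, false the patched one. */
static void toy_create_vcpu(struct toy_vcpu *v, bool alloc_at_create)
{
	v->vmxon = false;
	v->vpid02 = alloc_at_create ? allocate_vpid() : 0;
}

/* With the patch applied, this is the only place vpid02 gets allocated. */
static void toy_vmxon(struct toy_vcpu *v)
{
	if (v->vpid02 == 0)
		v->vpid02 = allocate_vpid();
	v->vmxon = true;
}

/* Mirrors the asymmetry in the report: vpid02 is freed only after VMXON. */
static void toy_destroy_vcpu(struct toy_vcpu *v)
{
	if (v->vmxon)
		free_vpid(v->vpid02);
}

/* Number of vpids currently handed out, not counting the reserved vpid 0. */
static int vpids_in_use(void)
{
	int n = 0;

	for (int i = 1; i < NR_VPIDS; i++)
		n += vpid_used[i];
	return n;
}

/* Create and destroy `cycles` vcpus that never execute VMXON. */
static int leaked_after(int cycles, bool alloc_at_create)
{
	struct toy_vcpu v;

	for (int i = 0; i < cycles; i++) {
		toy_create_vcpu(&v, alloc_at_create);
		toy_destroy_vcpu(&v);
	}
	return vpids_in_use();
}
```

In this model, leaked_after(n, true) grows by one per cycle until the
pool is empty, while leaked_after(n, false) stays at zero, because the
allocation now happens at the same point (VMXON) that gates the free.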