----- sean.j.christopherson@xxxxxxxxx wrote:

> On Wed, 2018-05-23 at 00:59 -0700, Liran Alon wrote:
> > ----- liran.alon@xxxxxxxxxx wrote:
> >
> > > ----- jmattson@xxxxxxxxxx wrote:
> > >
> > > > While we're on the subject, is there any need for L0 to allocate a
> > > > vpid02 in the common case, where nested EPT is enabled?
> > > >
> > > > Per section 28.3.2 of the SDM, volume 3, when EPT is enabled,
> > > > combined mappings in the TLB are tagged by {VPID, PCID, EP4TA}.
> > > > With nested EPT, vmcs02 and vmcs01 do not share an EP4TA.
> > > > Therefore, I think it suffices to simply copy the VPID field from
> > > > vmcs12 to vmcs02 in this case.
> > >
> > > Good point. I agree.
> > > This will trivially allow physical CPU to save multiple TLB entries
> > > populated by L2 with same EP4TA but different VPIDs.
> > >
> > > I do think however that this should be done on a separate patch
> > > series on top of this one.
> > > I will check if I can easily create that series of patches.
> > >
> > > Thanks,
> > > -Liran
> >
> > After some initial investigation, it seems current TLB management in
> > KVM is worse than I thought.
> >
> > By looking at vmx_set_cr3() (which is the only place which write to
> > VMCS EPT_POINTER), it seems that every load of a new EPT pointer will
> > vmx_flush_tlb(vcpu, true);
> > In case of running with enable_ept, this will flush all TLB entries
> > tagged with new loaded EPTP.
> >
> > This means that on nVMX scenario where vmcs12 uses EPT, the TLB
> > effectively gets flushed every time you switch between L1 and L2...
> >
> > In addition, even in non-nVMX scenarios, in the CPU over-commit case,
> > if a physical CPU switches between running a vCPU of one VM to a vCPU
> > of another VM, it will keep flushing TLB entries of both VMs even
> > though they are tagged with separate EPTP.
>
> nVMX aside, KVM's overarching design is to load a new MMU root,
> i.e. EPTP, only when necessary.
> Switching VMCSes should not invoke vmx_set_cr3() regardless of what
> prompted the VMCS switch, e.g. kvm_mmu_reload() only invokes set_cr3()
> if the MMU root is invalid, and vmx_vcpu_put() doesn't unload the MMU.

Yeah, you are right for the non-nVMX case. I mistakenly overlooked that
in my description.

> As for nVMX, both nested entry and exit explicitly reset the MMU
> via nested_vmx_load_cr3(), and nested entry also unloads the MMU
> when nested EPT is active, via nested_ept_init_mmu_context().
>
> Unloading the MMU on nested entry/exit doesn't seem deliberate,
> e.g. why bother with VPID handling in prepare_vmcs02() if KVM
> intends to unconditionally flush? I think figuring out how to
> avoid unloading the MMU in those cases will resolve the issue
> of the TLB being flushed on every switch between L1 and L2,
> though I get the feeling that that will mean doing a holistic
> analysis of the (nested) MMU handling.

Yes. Those are exactly my thoughts and what I planned to do.
I will create a series that will attempt to handle this.

> > Due to the above, I think I will create a series that will first fix
> > this issue and then perform the optimization suggested by Jim here.
> >
> > Regards,
> > -Liran
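
For reference, Jim's tagging argument can be illustrated with a toy model.
This is only a sketch under stated assumptions: struct tlb_tag and
tlb_tag_match() are hypothetical names made up for illustration, not KVM
or hardware interfaces, and the model covers only the EPT-enabled case
where combined mappings are tagged by {VPID, PCID, EP4TA}.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical toy model of the tag on a combined mapping when EPT is
 * enabled. The field names are illustrative, not taken from KVM.
 */
struct tlb_tag {
	uint16_t vpid;   /* VPID from the VMCS */
	uint16_t pcid;   /* PCID from guest CR3 */
	uint64_t ep4ta;  /* EPT PML4 table address from the EPTP */
};

/* Two mappings can alias only if every component of the tag matches. */
static bool tlb_tag_match(const struct tlb_tag *a, const struct tlb_tag *b)
{
	return a->vpid == b->vpid &&
	       a->pcid == b->pcid &&
	       a->ep4ta == b->ep4ta;
}
```

In this model, a vmcs01 tag and a vmcs02 tag that carry the same VPID but
different EP4TA values never match, which is the reason a separate vpid02
would not be needed when nested EPT is in use.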