Re: [PATCH v3 00/18] Shadow Paging performance improvements

On 27/06/2018 23:59, Junaid Shahid wrote:
> Changes since v2:
> - CR3_PCID_INVD is replaced by X86_CR3_PCID_NOFLUSH 
> - kvm_mmu_calc_root_page_role() and friends are no longer public
> - Simplified the race condition example in mmu_need_write_protect()
> - Added smp_load_acquire()s in kvm_mmu_sync_roots()
> - Ignored non-canonical addresses in vmx_flush_tlb_gva()
> - A couple of minor cleanups
> 
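For readers following the changelog, here is a minimal userspace sketch of two of the bit-level checks it mentions. It assumes the architectural layout where, with CR4.PCIDE=1, bit 63 of the value written to CR3 requests "no flush" and bits 11:0 carry the PCID, and it assumes 4-level paging (48-bit virtual addresses) for the canonical check. All sketch_* names and SKETCH_* macros are hypothetical and are not the helpers used by the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's X86_CR3_PCID_NOFLUSH (bit 63). */
#define SKETCH_CR3_PCID_NOFLUSH (UINT64_C(1) << 63)
/* With PCIDs enabled, the PCID occupies CR3 bits 11:0. */
#define SKETCH_CR3_PCID_MASK    UINT64_C(0xFFF)

/* Hypothetical helper: did the guest ask to keep TLB entries for this PCID? */
static bool sketch_cr3_skip_flush(uint64_t cr3)
{
    return (cr3 & SKETCH_CR3_PCID_NOFLUSH) != 0;
}

/* Hypothetical helper: extract the PCID from a CR3 value. */
static uint16_t sketch_cr3_pcid(uint64_t cr3)
{
    return (uint16_t)(cr3 & SKETCH_CR3_PCID_MASK);
}

/*
 * Hypothetical helper: with 48-bit virtual addresses, an address is
 * canonical when bits 63:47 are all zeros or all ones.
 */
static bool sketch_is_canonical48(uint64_t gva)
{
    uint64_t upper = gva >> 47; /* the 17 bits 63:47 */

    return upper == 0 || upper == UINT64_C(0x1FFFF);
}
```

A flush-by-address path could use a check like sketch_is_canonical48() to ignore addresses that can never have a TLB entry; with 5-level paging the shift and mask would differ.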
> Changes since v1:
> - Renamed the flags returned by set_spte
> - Split up a couple of changes into separate patches and refactored some
>   other patches
> - .set_cr3() handlers never flush TLB rather than taking that as parameter
> - Generalized lockless CR3 switching to work across different MMU modes
> - Implemented lockless CR3/EPTP switching for nested VMX L1<->L2 switches
> - Added an LRU cache containing multiple fast-switchable roots instead
>   of limiting it to only the immediately previous one.
> 
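The multi-entry LRU cache of fast-switchable roots from the last changelog item can be pictured roughly as below. This is only a userspace sketch under stated assumptions: the cache size of 3, the struct layout, and every sketch_* name are hypothetical, and locking, root validity checks, and freeing of evicted roots are all left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_PREV_ROOTS 3          /* assumed size, for illustration */
#define SKETCH_INVALID        UINT64_MAX /* marks an empty slot */

struct sketch_root {
    uint64_t cr3; /* guest CR3 value this root shadows */
    uint64_t hpa; /* host physical address of the shadow root */
};

struct sketch_mmu {
    struct sketch_root cur;
    /* prev[0] is the most recently used previous root */
    struct sketch_root prev[SKETCH_NUM_PREV_ROOTS];
};

static void sketch_mmu_init(struct sketch_mmu *mmu)
{
    size_t i;

    mmu->cur.cr3 = 0;
    mmu->cur.hpa = SKETCH_INVALID;
    for (i = 0; i < SKETCH_NUM_PREV_ROOTS; i++)
        mmu->prev[i] = mmu->cur;
}

/*
 * Try to switch to new_cr3 using a cached root. The old current root is
 * pushed to the front of prev[], shifting older entries back. On a hit
 * the matching entry becomes current and true is returned. On a miss the
 * least recently used entry falls off the end, cur is left without a
 * root, and false is returned so the caller can build one the slow way.
 */
static bool sketch_fast_cr3_switch(struct sketch_mmu *mmu, uint64_t new_cr3)
{
    struct sketch_root carry = mmu->cur;
    size_t i;

    if (carry.hpa != SKETCH_INVALID && carry.cr3 == new_cr3)
        return true; /* already on the requested root */

    for (i = 0; i < SKETCH_NUM_PREV_ROOTS; i++) {
        struct sketch_root tmp = mmu->prev[i];

        mmu->prev[i] = carry;
        carry = tmp;
        if (carry.hpa != SKETCH_INVALID && carry.cr3 == new_cr3) {
            mmu->cur = carry;
            return true;
        }
    }

    /* Miss: carry holds the evicted entry; a real MMU would free it. */
    mmu->cur.cr3 = new_cr3;
    mmu->cur.hpa = SKETCH_INVALID;
    return false;
}
```

With KPTI the guest alternates between a small set of CR3 values, so even a few cached roots should turn most switches into hits in a scheme like this.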
> The performance of shadow paging is severely degraded in some workloads
> when the guest kernel is using KPTI. This is primarily due to the vastly
> increased number of CR3 switches that result from KPTI.
> 
> This patch series implements various optimizations to reduce some of this
> overhead. Compared to the baseline, this results in a reduction from
> ~16m12s to ~4m44s for a 4-VCPU kernel compile benchmark and from ~25m5s to
> ~14m50s for a 1-VCPU kernel compile benchmark.

Great!  The CPUID microbenchmark with 16 vCPUs is down by about 25%.
The remaining overhead comes from the get_user_pages calls in
nested_get_vmcs12_pages, which contend on pmd_lock.

We should be able to cache those using the MMU notifier, similar to how
the APIC access page is already handled.
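To illustrate the idea only, a rough userspace model of a lookup cache that is dropped from a notifier-style callback might look like this. Every name here (sketch_page_cache, sketch_get_page, sketch_invalidate_range, and the stand-in slow lookup) is hypothetical and is not KVM or MMU notifier API; real code would also need locking or a sequence count to close the race between lookup and invalidation.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_page_cache {
    uint64_t hva;               /* host virtual address of the cached page */
    void *page;                 /* cached lookup result, NULL if invalid */
    unsigned long slow_lookups; /* counts trips through the slow path */
};

static int sketch_backing_page; /* stand-in for a pinned page */

/* Stand-in for the expensive lookup (get_user_pages in the mail above). */
static void *sketch_slow_lookup(struct sketch_page_cache *c, uint64_t hva)
{
    (void)hva;
    c->slow_lookups++;
    return &sketch_backing_page;
}

/* Return the cached page, taking the slow path only when needed. */
static void *sketch_get_page(struct sketch_page_cache *c, uint64_t hva)
{
    if (c->page == NULL || c->hva != hva) {
        c->page = sketch_slow_lookup(c, hva);
        c->hva = hva;
    }
    return c->page;
}

/* Notifier-style callback: drop the cache if [start, end) covers it. */
static void sketch_invalidate_range(struct sketch_page_cache *c,
                                    uint64_t start, uint64_t end)
{
    if (c->page != NULL && c->hva >= start && c->hva < end)
        c->page = NULL;
}
```

In this model, repeated entries that reference the same page take the slow path once, and again only after an invalidation that overlaps the cached address.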

Thanks,

Paolo

> Junaid Shahid (18):
>   kvm: x86: Make sync_page() flush remote TLBs once only
>   kvm: x86: Avoid taking MMU lock in kvm_mmu_sync_roots if no sync is
>     needed
>   kvm: x86: Add fast CR3 switch code path
>   kvm: x86: Introduce kvm_mmu_calc_root_page_role()
>   kvm: x86: Introduce KVM_REQ_LOAD_CR3
>   kvm: x86: Add support for fast CR3 switch across different MMU modes
>   kvm: x86: Support resetting the MMU context without resetting roots
>   kvm: x86: Use fast CR3 switch for nested VMX
>   kvm: x86: Add ability to skip TLB flush when switching CR3
>   kvm: x86: Propagate guest PCIDs to host PCIDs
>   kvm: vmx: Support INVPCID in shadow paging mode
>   kvm: x86: Skip TLB flush on fast CR3 switch when indicated by guest
>   kvm: x86: Add a root_hpa parameter to kvm_mmu->invlpg()
>   kvm: x86: Support selectively freeing either current or previous MMU
>     root
>   kvm: x86: Skip shadow page resync on CR3 switch when indicated by
>     guest
>   kvm: x86: Flush only affected TLB entries in kvm_mmu_invlpg*
>   kvm: x86: Add multi-entry LRU cache for previous CR3s
>   kvm: x86: Remove CR3_PCID_INVD flag
> 
>  arch/x86/include/asm/kvm_host.h |  32 ++-
>  arch/x86/kvm/emulate.c          |   2 +-
>  arch/x86/kvm/mmu.c              | 468 ++++++++++++++++++++++++++------
>  arch/x86/kvm/mmu.h              |  24 +-
>  arch/x86/kvm/paging_tmpl.h      |  18 +-
>  arch/x86/kvm/svm.c              |  12 +-
>  arch/x86/kvm/vmx.c              | 154 ++++++++++-
>  arch/x86/kvm/x86.c              |  18 +-
>  virt/kvm/kvm_main.c             |  14 +-
>  9 files changed, 630 insertions(+), 112 deletions(-)
> 



