The performance of shadow paging is severely degraded in some workloads
when the guest kernel is using KPTI. This is primarily due to the vastly
increased number of CR3 switches that result from KPTI. This patch series
implements various optimizations to reduce some of this overhead.

Compared to the baseline, this results in a reduction from ~16m12s to
~4m44s for a 4-VCPU kernel compile benchmark and from ~25m5s to ~14m50s
for a 1-VCPU kernel compile benchmark.

Junaid Shahid (11):
  kvm: x86: Make sync_page() flush remote TLBs once only
  kvm: x86: Refactor mmu_free_roots()
  kvm: x86: Add fast CR3 switch code path
  kvm: x86: Suppress CR3_PCID_INVD bit only when PCIDs are enabled
  kvm: x86: Add ability to skip TLB flush when switching CR3
  kvm: x86: Map guest PCIDs to host PCIDs
  kvm: vmx: Support INVPCID in shadow paging mode
  kvm: x86: Skip TLB flush on fast CR3 switch when indicated by guest
  kvm: x86: Add a root_hpa parameter to kvm_mmu->invlpg()
  kvm: x86: Skip shadow page resync on CR3 switch when indicated by guest
  kvm: x86: Flush only affected TLB entries in kvm_mmu_invlpg*

 arch/x86/include/asm/kvm_host.h |  27 ++-
 arch/x86/kvm/mmu.c              | 311 ++++++++++++++++++++++++++------
 arch/x86/kvm/mmu.h              |  14 ++
 arch/x86/kvm/paging_tmpl.h      |  18 +-
 arch/x86/kvm/svm.c              |  25 ++-
 arch/x86/kvm/vmx.c              | 134 +++++++++++++-
 arch/x86/kvm/x86.c              |  21 ++-
 7 files changed, 466 insertions(+), 84 deletions(-)

-- 
2.17.0.441.gb46fe60e1d-goog