[PATCH] x86/kvm/mmu: make mmu->prev_roots cache work for NPT case

Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> · Fri, 22 Feb 2019 17:46:16 +0100

I noticed that fast_cr3_switch() always fails when we switch back from L2
to L1 because it is unable to find a cached root. This is odd: the host's
CR3 usually stays the same, so we expect to always take the fast path. It
turns out that the page role never matches: kvm_mmu_get_page() filters out
cr4_pae when direct is set, the filtered value is stored in the page
header, and it is later compared with new_role in
cached_root_available(). As cr4_pae is always set in long mode, the
prev_roots cache is dysfunctional.

The problem appeared after we introduced kvm_calc_mmu_role_common():
previously, we only set this bit for the shadow MMU root, but now the
common helper sets it for everything. Restore the original behavior.
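
The mismatch described above can be modeled in a few lines of standalone
C. This is only a sketch under simplifying assumptions: struct sketch_role
and all sketch_*() helpers are hypothetical stand-ins, not kernel code,
and only the direct and cr4_pae bits of the role are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-bit stand-in for the real page role. */
struct sketch_role {
	bool direct;	/* direct (TDP/NPT) root */
	bool cr4_pae;
};

/* Models the filtering done when the page header is filled in. */
static struct sketch_role sketch_stored_role(struct sketch_role requested)
{
	if (requested.direct)
		requested.cr4_pae = false;
	return requested;
}

/* Models the cache check: the stored role must match exactly. */
static bool sketch_root_matches(struct sketch_role stored,
				struct sketch_role wanted)
{
	return stored.direct == wanted.direct &&
	       stored.cr4_pae == wanted.cr4_pae;
}

/*
 * Role computed for a direct root. common_sets_pae == true models the
 * behavior being fixed (the common helper sets cr4_pae); false models
 * the behavior with this patch applied.
 */
static struct sketch_role sketch_calc_direct_role(bool long_mode,
						  bool common_sets_pae)
{
	struct sketch_role role = { .direct = true, .cr4_pae = false };

	if (common_sets_pae)
		role.cr4_pae = long_mode;
	return role;
}

/* True if a root created with this role is found again on lookup. */
static bool sketch_fast_switch_hits(bool common_sets_pae)
{
	struct sketch_role wanted =
		sketch_calc_direct_role(true, common_sets_pae);
	struct sketch_role stored = sketch_stored_role(wanted);

	return sketch_root_matches(stored, wanted);
}
```

In this model, sketch_fast_switch_hits(true) returns false because the
stored role has cr4_pae cleared while the lookup key still has it set;
sketch_fast_switch_hits(false) returns true.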

Fixes: 7dcd57552008 ("x86/kvm/mmu: check if tdp/shadow MMU reconfiguration is needed")
Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
RFC:
Alternatively, I can suggest two other solutions:
- Do not clear cr4_pae in kvm_mmu_get_page() and check direct at the call
  sites instead (detect_write_misaligned(), get_written_sptes()).
- Filter cr4_pae out when direct in kvm_mmu_new_cr3().
The code in kvm_mmu_get_page() is very ancient, so I'm afraid to touch
it :=)
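
For comparison, the second alternative (filtering cr4_pae out of the
lookup key when direct) might look roughly like the standalone sketch
below. Everything here is hypothetical: struct alt_role, struct
alt_cached_root, the alt_*() helpers and the cache size are illustrative
and do not mirror the real kvm_mmu_new_cr3() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ALT_NUM_PREV_ROOTS 3	/* illustrative size, not from the patch */

struct alt_role {
	bool direct;
	bool cr4_pae;
};

struct alt_cached_root {
	uint64_t cr3;
	struct alt_role role;	/* role as stored in the page header */
	bool valid;
};

/* Clear cr4_pae for direct roles, as the page header already does. */
static struct alt_role alt_normalize(struct alt_role role)
{
	if (role.direct)
		role.cr4_pae = false;
	return role;
}

/* Returns the index of a matching cached root, or -1 if none matches. */
static int alt_find_root(const struct alt_cached_root *roots, size_t n,
			 uint64_t cr3, struct alt_role new_role,
			 bool normalize)
{
	if (normalize)
		new_role = alt_normalize(new_role);

	for (size_t i = 0; i < n; i++) {
		if (roots[i].valid && roots[i].cr3 == cr3 &&
		    roots[i].role.direct == new_role.direct &&
		    roots[i].role.cr4_pae == new_role.cr4_pae)
			return (int)i;
	}
	return -1;
}

/* Caches one direct root in slot 1, then looks it up again. */
static int alt_demo_lookup(bool normalize)
{
	struct alt_cached_root roots[ALT_NUM_PREV_ROOTS] = { 0 };
	struct alt_role wanted = { .direct = true, .cr4_pae = true };

	roots[1].cr3 = 0x1000;
	roots[1].role = alt_normalize(wanted);	/* cr4_pae cleared */
	roots[1].valid = true;

	return alt_find_root(roots, ALT_NUM_PREV_ROOTS, 0x1000, wanted,
			     normalize);
}
```

In this model, alt_demo_lookup(false) misses (-1) and
alt_demo_lookup(true) finds the root in slot 1.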
---
 arch/x86/kvm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index f2d1d230d5b8..c729e98eee49 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4791,7 +4791,6 @@ static union kvm_mmu_role kvm_calc_mmu_role_common(struct kvm_vcpu *vcpu,
 
 	role.base.access = ACC_ALL;
 	role.base.nxe = !!is_nx(vcpu);
-	role.base.cr4_pae = !!is_pae(vcpu);
 	role.base.cr0_wp = is_write_protection(vcpu);
 	role.base.smm = is_smm(vcpu);
 	role.base.guest_mode = is_guest_mode(vcpu);
@@ -4871,6 +4870,8 @@ kvm_calc_shadow_mmu_root_page_role(struct kvm_vcpu *vcpu, bool base_only)
 {
 	union kvm_mmu_role role = kvm_calc_mmu_role_common(vcpu, base_only);
 
+	role.base.cr4_pae = !!is_pae(vcpu);
+
 	role.base.smep_andnot_wp = role.ext.cr4_smep &&
 		!is_write_protection(vcpu);
 	role.base.smap_andnot_wp = role.ext.cr4_smap &&
-- 
2.20.1