On 10/10/2017 09:01, Haozhong Zhang wrote: > When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the > guest CR4. Before this CR4 loading, the guest CR4 refers to L2 > CR4. Because these two CR4's are in different levels of guest, we > should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which > is used to handle guest writes to its CR4, checks the guest change to > CR4 and may fail if the change is invalid. > > The failure may cause trouble. Consider we start > a L1 guest with non-zero L1 PCID in use, > (i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0) > and > a L2 guest with L2 PCID disabled, > (i.e. L2 CR4.PCIDE == 0) > and following events may happen: > > 1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4 > into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because > of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e. > vcpu->arch.cr4) is left to the value of L2 CR4. > > 2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit, > kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID, > because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1 > CR3.PCID != 0, L0 KVM will inject GP to L1 guest. > > Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1") > Signed-off-by: Haozhong Zhang <haozhong.zhang@xxxxxxxxx> Makes sense, and none of the other effects of kvm_set_cr4 matter: // MMU is already reset on L2->L1 exit if (((cr4 ^ old_cr4) & pdptr_bits) || (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE))) kvm_mmu_reset_context(vcpu); // L2 CPUID always exits to L1 if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE)) kvm_update_cpuid(vcpu); Would you please write a testcase too? Paolo > --- > arch/x86/kvm/vmx.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index b9d2140eb212..3fe639e0e1f6 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -11297,7 +11297,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu, > > /* Same as above - no reason to call set_cr4_guest_host_mask(). */ > vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK); > - kvm_set_cr4(vcpu, vmcs12->host_cr4); > + vmx_set_cr4(vcpu, vmcs12->host_cr4); > > nested_ept_uninit_mmu_context(vcpu); > >