Re: KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Thu, 16 Apr 2009 06:10:42 -0300

On Thu, Apr 16, 2009 at 11:56:15AM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
>> checking for the new cr3 value:
>>
>> "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
>> the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
>> when userspace next tries to call KVM_RUN on the affected VCPU and kvm
>> attempts to activate the new non-existent page table root.
>>
>> This happens since kvm only validates that cr3 points to a valid guest
>> physical memory page when code *inside* the guest sets cr3. However, kvm
>> currently trusts the userspace caller (e.g. QEMU) on the host machine to
>> always supply a valid page table root, rather than properly validating
>> it along with the rest of the reloaded guest state."
>>
>> http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599
>>
>> Follow Avi's suggestion to use kvm_set_cr3, and do the same for
>> assignment of cr4. Note kvm_set_cr4 unconditionally resets the mmu
>> context, as long as cr4 is valid.
>>
>> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
>>
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 148cde2..89fb3c7 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>>  	kvm_x86_ops->set_gdt(vcpu, &dt);
>>  
>>  	vcpu->arch.cr2 = sregs->cr2;
>> -	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
>> -	vcpu->arch.cr3 = sregs->cr3;
>>  
>> +	kvm_set_cr3(vcpu, sregs->cr3);
>>  	kvm_set_cr8(vcpu, sregs->cr8);
>>  
>>  	mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer;
>>  	kvm_x86_ops->set_efer(vcpu, sregs->efer);
>>  	kvm_set_apic_base(vcpu, sregs->apic_base);
>>  
>> -	kvm_x86_ops->decache_cr4_guest_bits(vcpu);
>> -
>>  	mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0;
>>  	kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
>>  	vcpu->arch.cr0 = sregs->cr0;
>>  
>> -	mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4;
>> -	kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
>> -	if (!is_long_mode(vcpu) && is_pae(vcpu))
>> -		load_pdptrs(vcpu, vcpu->arch.cr3);
>> +	kvm_set_cr4(vcpu, sregs->cr4);
>>  
>>  	if (mmu_reset_needed)
>>  		kvm_mmu_reset_context(vcpu);
>>   
>
> Consider the following:
>
> current state:
>  cr3 = 0
>  cr4.pae = 0
>
> new state:
>  cr3 = 0x800
>  cr4.pae = 1
>
> When you call kvm_set_cr3(), it will inject a #GP into the guest because  
> we are setting bit 11 when cr4.pae=0, which is illegal.  However the new  
> cr4.pae=1, so the new state was in fact legal!
>
> There are a few ways out, one is to first go back to real mode and set  
> everything up carefully in the right order (including EFER.LMA and  
> EFER.LME, and CS.L).  The other is to refactor kvm_set_* so that we have  
> internal setters which won't trigger these faults (but do need to check  
> at the end that the state is legal).
>
> This first method is probably better since that's what the guest does  
> when booting anyway.
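
To make the ordering problem above concrete, here is a minimal standalone
sketch. It is a toy model, not kernel code: the toy_* names and the two
reserved-bit masks are assumptions, chosen only so that bit 11 of cr3 is
rejected without PAE and accepted with PAE, as in the example. It contrasts
checking the new cr3 against the current cr4 (what a guest-style setter
does) with checking it against the new cr4 (the "validate at the end"
approach).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not kernel code. The masks are assumptions picked so that
 * bit 11 of cr3 is reserved without PAE and allowed with PAE. */
#define TOY_CR4_PAE             (1ull << 5)
#define TOY_CR3_NONPAE_RESERVED 0xfe7ull
#define TOY_CR3_PAE_RESERVED    0x7ull

struct toy_sregs {
	uint64_t cr3;
	uint64_t cr4;
};

/* Is cr3 legal under the paging mode selected by cr4? */
static bool toy_cr3_valid(uint64_t cr3, uint64_t cr4)
{
	uint64_t reserved = (cr4 & TOY_CR4_PAE) ? TOY_CR3_PAE_RESERVED
						: TOY_CR3_NONPAE_RESERVED;

	return (cr3 & reserved) == 0;
}

/* Incremental update: cr3 is checked against the *current* cr4, the way a
 * guest-style setter would. Returns false where the guest would get #GP. */
static bool toy_set_incremental(struct toy_sregs *cur,
				const struct toy_sregs *next)
{
	assert(cur && next);
	if (!toy_cr3_valid(next->cr3, cur->cr4))
		return false;
	cur->cr3 = next->cr3;
	cur->cr4 = next->cr4;
	return true;
}

/* Whole-state update: the new cr3 is checked against the *new* cr4, and
 * the current state is replaced only if the combination is legal. */
static bool toy_set_whole(struct toy_sregs *cur,
			  const struct toy_sregs *next)
{
	assert(cur && next);
	if (!toy_cr3_valid(next->cr3, next->cr4))
		return false;
	*cur = *next;
	return true;
}
```

Starting from cr3 = 0, cr4.pae = 0 and asking for cr3 = 0x800, cr4.pae = 1,
toy_set_incremental() refuses the update while toy_set_whole() accepts it.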

Humpf. What about something like this? Or inject a #GP instead of a triple fault?

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148cde2..3e63bac 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3986,7 +3986,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 
 	vcpu->arch.cr2 = sregs->cr2;
 	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
-	vcpu->arch.cr3 = sregs->cr3;
+	if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
+		vcpu->arch.cr3 = sregs->cr3;
+	else
+		set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
 
 	kvm_set_cr8(vcpu, sregs->cr8);
 
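
For readers unfamiliar with the check in the hunk above, the idea can be
sketched outside the kernel roughly as follows. This is a toy model under
stated assumptions: struct toy_memslot, toy_gfn_to_memslot() and
toy_cr3_backed() are hypothetical stand-ins, not the real KVM structures or
helpers. It only shows the principle: shift cr3 down to a guest frame
number and see whether any memory slot covers it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12

/* Hypothetical, simplified memory slot: a run of guest page frames. */
struct toy_memslot {
	uint64_t base_gfn;
	uint64_t npages;
};

/* Return the slot containing gfn, or NULL if no slot backs it. */
static const struct toy_memslot *
toy_gfn_to_memslot(const struct toy_memslot *slots, size_t nslots,
		   uint64_t gfn)
{
	for (size_t i = 0; i < nslots; i++) {
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	}
	return NULL;
}

/* True if the page that cr3 points at lies inside some slot. */
static bool toy_cr3_backed(const struct toy_memslot *slots, size_t nslots,
			   uint64_t cr3)
{
	assert(slots || nslots == 0);
	return toy_gfn_to_memslot(slots, nslots,
				  cr3 >> TOY_PAGE_SHIFT) != NULL;
}
```

With a single slot covering the first 256 pages, a cr3 of 0xff000 is
accepted and a cr3 of 0x100000 is rejected. In this toy form the check only
confirms that the root page is backed by guest memory; it says nothing about
reserved bits in cr3.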