On Thu, Apr 16, 2009 at 11:56:15AM +0300, Avi Kivity wrote: > Marcelo Tosatti wrote: >> Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity >> checking for the new cr3 value: >> >> "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to >> the kernel. This will trigger a NULL pointer access in gfn_to_rmap() >> when userspace next tries to call KVM_RUN on the affected VCPU and kvm >> attempts to activate the new non-existent page table root. >> >> This happens since kvm only validates that cr3 points to a valid guest >> physical memory page when code *inside* the guest sets cr3. However, kvm >> currently trusts the userspace caller (e.g. QEMU) on the host machine to >> always supply a valid page table root, rather than properly validating >> it along with the rest of the reloaded guest state." >> >> http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599 >> >> Follow Avi's suggestion to use kvm_set_cr3, and do the same for >> assigment of cr4. Note kvm_set_cr4 unconditionally resets the mmu >> context, as long as cr4 is valid. >> >> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx> >> >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 148cde2..89fb3c7 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, >> kvm_x86_ops->set_gdt(vcpu, &dt); >> vcpu->arch.cr2 = sregs->cr2; >> - mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3; >> - vcpu->arch.cr3 = sregs->cr3; >> + kvm_set_cr3(vcpu, sregs->cr3); >> kvm_set_cr8(vcpu, sregs->cr8); >> mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer; >> kvm_x86_ops->set_efer(vcpu, sregs->efer); >> kvm_set_apic_base(vcpu, sregs->apic_base); >> - kvm_x86_ops->decache_cr4_guest_bits(vcpu); >> - >> mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0; >> kvm_x86_ops->set_cr0(vcpu, sregs->cr0); >> vcpu->arch.cr0 = sregs->cr0; >> - mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4; >> - kvm_x86_ops->set_cr4(vcpu, sregs->cr4); >> - if (!is_long_mode(vcpu) && is_pae(vcpu)) >> - load_pdptrs(vcpu, vcpu->arch.cr3); >> + kvm_set_cr4(vcpu, sregs->cr4); >> if (mmu_reset_needed) >> kvm_mmu_reset_context(vcpu); >> > > Consider the following: > > current state: > cr3 = 0 > cr4.pae = 0 > > new state: > cr3 = 0x800 > cr4.pae = 1 > > When you call kvm_set_cr3(), it will inject a #GP into the guest because > we are setting bit 11 when cr4.pae=0, which is illegal. However the new > cr4.pae=1, so the new state was in fact legal! > > There are a few ways out, one is to first go back to real mode and set > eveything up carefully in the right order (including EFER.LMA and > EFER.LME, and CS.L). The other is to refactor kvm_set_* so that we have > internal setters which won't trigger these faults (but do need to check > at the end that the state is legal). > > This first method is probably better since that's what the guest does > when booting anyway. Humpf. And something like this? Or GP# instead of triple fault? diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 148cde2..3e63bac 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3986,7 +3986,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, vcpu->arch.cr2 = sregs->cr2; mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3; - vcpu->arch.cr3 = sregs->cr3; + if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT)) + vcpu->arch.cr3 = sregs->cr3; + else + set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests); kvm_set_cr8(vcpu, sregs->cr8); -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html