On 07/11/2018 13:58, Liran Alon wrote:
>
>
>> On 7 Nov 2018, at 14:47, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>
>> On 07/11/2018 13:10, Alexander Potapenko wrote:
>>> This appears to be a real bug in KVM.
>>> Please see a simplified reproducer attached.
>>
>> Thanks, I agree it's a real bug.  The basic issue is that the
>> kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE
>> ioctl, aka 0x4080aebf.
>>
>> One way to fix it would be to just change kmalloc to kzalloc when
>> allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl
>> is wrong and should be rejected.  And the case where a shadow VMCS has
>> to be loaded is even more wrong, and we have to fix it anyway, so I
>> don't really like the idea of papering over the bug in the allocation.
>>
>> I'll test this patch and submit it formally:
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index c645f777b425..c546f0b1f3e0 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct
>> kvm_vcpu *vcpu,
>>  	if (ret)
>>  		return ret;
>>
>> -	/* Empty 'VMXON' state is permitted */
>> -	if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12))
>> +	/* Empty 'VMXON' state is permitted.  A partial VMCS12 is not.  */
>> +	if (kvm_state->size == sizeof(kvm_state))
>>  		return 0;
>>
>> +	if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE)
>> +		return -EINVAL;
>> +
>
> I don’t think that this test is sufficient to fully resolve the issue.
> What if malicious userspace supplies a valid size but the pages containing nested_state->vmcs12 are unmapped?
> This will result in vmx_set_nested_state() still calling set_current_vmptr() but failing on copy_from_user(),
> which still leaks cached_vmcs12 on the next VMPTRLD of the guest.

Makes sense; since SET_NESTED_STATE is not a fast path, we can just
memdup_user and pass a kernel pointer to vmx_set_nested_state.

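For illustration only, here is a rough user-space sketch of that ordering:
take a private copy of the caller's buffer, validate everything against the
copy, and touch the vCPU state only once nothing can fail any more.  All
sketch_* names, the header layout and the alignment check are hypothetical
stand-ins rather than the actual vmx.c code, and malloc plus memcpy merely
plays the role that memdup_user would play in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   0x1000u
#define SKETCH_VMCS12_SIZE 0x1000u	/* stand-in for VMCS12_SIZE */

/* Hypothetical header, loosely modelled on struct kvm_nested_state. */
struct sketch_state_hdr {
	uint32_t size;		/* total bytes supplied: header + payload */
	uint64_t vmcs_pa;	/* address of the current VMCS */
};

/* Hypothetical stand-in for the per-vCPU nested state. */
struct sketch_vcpu {
	uint64_t current_vmptr;
	uint8_t cached_vmcs12[SKETCH_VMCS12_SIZE];
};

/*
 * Returns 0 on success and -1 on invalid input or allocation failure.
 * *vcpu is written only after every check has passed.
 */
static int sketch_set_nested_state(struct sketch_vcpu *vcpu,
				   const void *user_buf, size_t user_len)
{
	struct sketch_state_hdr hdr;
	uint8_t *copy;

	assert(vcpu != NULL && user_buf != NULL);

	if (user_len < sizeof(hdr))
		return -1;
	memcpy(&hdr, user_buf, sizeof(hdr));

	/* The claimed size must match what was actually handed in. */
	if (hdr.size != user_len)
		return -1;

	/* Header only: empty state is permitted, nothing to apply. */
	if (hdr.size == sizeof(hdr))
		return 0;

	/* A partial payload is rejected outright. */
	if (hdr.size < sizeof(hdr) + SKETCH_VMCS12_SIZE)
		return -1;

	/* Step 1: gather. Take a private copy of the payload. */
	copy = malloc(SKETCH_VMCS12_SIZE);
	if (copy == NULL)
		return -1;
	memcpy(copy, (const uint8_t *)user_buf + sizeof(hdr),
	       SKETCH_VMCS12_SIZE);

	/* Step 2: validate. Hypothetical check: page-aligned address. */
	if (hdr.vmcs_pa & (SKETCH_PAGE_SIZE - 1)) {
		free(copy);
		return -1;
	}

	/* Step 3: apply. Nothing below this point can fail. */
	vcpu->current_vmptr = hdr.vmcs_pa;
	memcpy(vcpu->cached_vmcs12, copy, SKETCH_VMCS12_SIZE);
	free(copy);
	return 0;
}
```

The point of the ordering is that a rejected request leaves the vCPU exactly
as it was, so there is no half-applied state to leak later.
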
> Therefore, I think that the correct patch should be to change vmx_set_nested_state() to
> first gather all relevant information from userspace and validate it,
> and only then start applying it to KVM’s internal vCPU state.
>
>>  	if (kvm_state->vmx.vmcs_pa != -1ull) {
>>  		if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa ||
>>  		    !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa))
>> @@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
>> *vcpu,
>>  	}
>>
>>  	vmcs12 = get_vmcs12(vcpu);
>> +	BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE);
>
> Why put this BUILD_BUG_ON() specifically here?
> There are many places which assume cached_vmcs12 is of size VMCS12_SIZE.
> (Such as nested_release_vmcs12() and handle_vmptrld()).

Unlike those places, here the copy has sizeof(*vmcs12) bytes and an
overflow would cause a userspace write to kernel memory.  Though, that
means there is still a possibility of leaking kernel data when
nested_release_vmcs12 is called.  So it also makes sense to use
VMCS12_SIZE for the memory copies, and kzalloc.

Thanks,

Paolo

>>  	if (copy_from_user(vmcs12, user_kvm_nested_state->data, sizeof(*vmcs12)))
>>  		return -EFAULT;
>>
>> @@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu
>> *vcpu,
>>  	if (nested_cpu_has_shadow_vmcs(vmcs12) &&
>>  	    vmcs12->vmcs_link_pointer != -1ull) {
>>  		struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu);
>> -		if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12))
>> +		if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE)
>>  			return -EINVAL;
>>
>>  		if (copy_from_user(shadow_vmcs12,
>>
>> Paolo
>
> -Liran
>
>
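
As a footnote to the "VMCS12_SIZE for the memory copies, and kzalloc" idea
above, a small user-space sketch of what that combination buys.  The
sketch_* names and the struct layout are made up for illustration, calloc
stands in for kzalloc, and the macro is only a simplified imitation of the
kernel's BUILD_BUG_ON.  The cache is a fixed-size, zero-filled buffer, every
copy in or out moves exactly that many bytes, and the size check for the
shadow-VMCS case counts two such buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_VMCS12_SIZE 0x1000u	/* stand-in for VMCS12_SIZE */

/* Simplified imitation of BUILD_BUG_ON: negative array size if cond holds. */
#define SKETCH_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical structure that is smaller than the buffer holding it. */
struct sketch_vmcs12 {
	uint32_t revision_id;
	uint32_t abort;
	uint64_t fields[32];
};

/* Zero-filled cache, so bytes past the struct never hold stale data. */
static uint8_t *sketch_alloc_cache(void)
{
	return calloc(1, SKETCH_VMCS12_SIZE);
}

/* Copy in: always the full fixed size, never sizeof(struct). */
static void sketch_copy_in(uint8_t *cache, const uint8_t *src)
{
	assert(cache != NULL && src != NULL);
	SKETCH_BUILD_BUG_ON(sizeof(struct sketch_vmcs12) > SKETCH_VMCS12_SIZE);
	memcpy(cache, src, SKETCH_VMCS12_SIZE);
}

/* Copy out to a guest-visible page: the same fixed size again. */
static void sketch_copy_out(uint8_t *guest_page, const uint8_t *cache)
{
	assert(guest_page != NULL && cache != NULL);
	memcpy(guest_page, cache, SKETCH_VMCS12_SIZE);
}

/* Size check: one buffer normally, two when a shadow VMCS is included. */
static int sketch_size_ok(size_t total, size_t hdr_len, int has_shadow)
{
	size_t need = hdr_len + (has_shadow ? 2u : 1u) * (size_t)SKETCH_VMCS12_SIZE;

	return total >= need;
}
```

With the zero-filled allocation, copying a never-loaded cache out to the
guest yields only zeroes, which is the leak the thread is worried about.
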