Re: [PATCH v2] KVM: VMX: Fix crash due to uninitialized current_vmcs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 23, 2023, Alexandru Matei wrote:
> On 1/23/2023 3:24 PM, Vitaly Kuznetsov wrote:
> > Alexandru Matei <alexandru.matei@xxxxxxxxxx> writes:
> > 
> >> KVM enables 'Enlightened VMCS' and 'Enlightened MSR Bitmap' when running as
> >> a nested hypervisor on top of Hyper-V. When MSR bitmap is updated,
> >> evmcs_touch_msr_bitmap function uses current_vmcs per-cpu variable to mark
> >> that the msr bitmap was changed.
> >>
> >> vmx_vcpu_create() modifies the msr bitmap via vmx_disable_intercept_for_msr
> >> -> vmx_msr_bitmap_l01_changed which in the end calls this function. The
> >> function checks for current_vmcs if it is null but the check is
> >> insufficient because current_vmcs is not initialized. Because of this, the
> >> code might incorrectly write to the structure pointed by current_vmcs value
> >> left by another task. Preemption is not disabled, the current task can be
> >> preempted and moved to another CPU while current_vmcs is accessed multiple
> >> times from evmcs_touch_msr_bitmap() which leads to crash.
> >>
> >> The manipulation of MSR bitmaps by callers happens only for vmcs01 so the
> >> solution is to use vmx->vmcs01.vmcs instead of current_vmcs.
> >>
> >> BUG: kernel NULL pointer dereference, address: 0000000000000338
> >> PGD 4e1775067 P4D 0
> >> Oops: 0002 [#1] PREEMPT SMP NOPTI
> >> ...
> >> RIP: 0010:vmx_msr_bitmap_l01_changed+0x39/0x50 [kvm_intel]
> >> ...
> >> Call Trace:
> >>  vmx_disable_intercept_for_msr+0x36/0x260 [kvm_intel]
> >>  vmx_vcpu_create+0xe6/0x540 [kvm_intel]
> >>  ? __vmalloc_node+0x4a/0x70
> >>  kvm_arch_vcpu_create+0x1d1/0x2e0 [kvm]
> >>  kvm_vm_ioctl_create_vcpu+0x178/0x430 [kvm]
> >>  ? __handle_mm_fault+0x3cb/0x750
> >>  kvm_vm_ioctl+0x53f/0x790 [kvm]
> >>  ? syscall_exit_work+0x11a/0x150
> >>  ? syscall_exit_to_user_mode+0x12/0x30
> >>  ? do_syscall_64+0x69/0x90
> >>  ? handle_mm_fault+0xc5/0x2a0
> >>  __x64_sys_ioctl+0x8a/0xc0
> >>  do_syscall_64+0x5c/0x90
> >>  ? exc_page_fault+0x62/0x150
> >>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >>
> >> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> >> Signed-off-by: Alexandru Matei <alexandru.matei@xxxxxxxxxx>
> >> ---
> >> v2:
> >>   - pass (e)vmcs01 to evmcs_touch_msr_bitmap
> >>   - use loaded_vmcs * instead of vcpu_vmx * to avoid
> >>     including vmx.h which generates circular dependency
> >>
> >> v1: https://lore.kernel.org/kvm/20230118141348.828-1-alexandru.matei@xxxxxxxxxx/
> >>
> >>  arch/x86/kvm/vmx/hyperv.h | 16 +++++++++++-----
> >>  arch/x86/kvm/vmx/vmx.c    |  2 +-
> >>  2 files changed, 12 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
> >> index 571e7929d14e..132e32e57d2d 100644
> >> --- a/arch/x86/kvm/vmx/hyperv.h
> >> +++ b/arch/x86/kvm/vmx/hyperv.h
> >> @@ -190,13 +190,19 @@ static inline u16 evmcs_read16(unsigned long field)
> >>  	return *(u16 *)((char *)current_evmcs + offset);
> >>  }
> >>  
> >> -static inline void evmcs_touch_msr_bitmap(void)
> >> +static inline void evmcs_touch_msr_bitmap(struct loaded_vmcs *vmcs01)
> > 
> > Personally, I would've followed Sean's suggestion and passed "struct
> > vcpu_vmx *vmx" as 'loaded_vmcs' here is a bit ambiguos....
> 
> I couldn't use vcpu_vmx *, it requires including vmx.h in hyperv.h 
> and it generates a circular depedency which leads to compile errors.
> 
> > 
> >>  {
> >> -	if (unlikely(!current_evmcs))
> >> +	/*
> >> +	 * Enlightened MSR Bitmap feature is enabled only for L1, i.e.
> >> +	 * always operates on (e)vmcs01
> >> +	 */
> >> +	struct hv_enlightened_vmcs* evmcs = (void*)vmcs01->vmcs;
> > 
> > (Nit: a space between "void" and "*")
> > 
> > .. or, alternatively, you can directly pass 
> > (struct hv_enlightened_vmcs *)vmx->vmcs01.vmcs as e.g. 'current_evmcs'
> > and avoid the conversion here.
> 
> OK, sounds good, I'll pass hv_enlightened_vmcs * directly.

Passing the eVMCS is silly, if we're going to bleed eVMCS details into vmx.c then
we should just commit and expose all details.  For this feature specifically, KVM
already handles the enabling in vmx.c / vmx_vcpu_create():

	if (IS_ENABLED(CONFIG_HYPERV) && static_branch_unlikely(&enable_evmcs) &&
	    (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)) {
		struct hv_enlightened_vmcs *evmcs = (void *)vmx->vmcs01.vmcs;

		evmcs->hv_enlightenments_control.msr_bitmap = 1;
	}

And if we handle this fully in vmx_msr_bitmap_l01_changed(), then there's no need
for a comment explaining that the feature is only enabled for vmcs01.

If we want to maintain better separate between VMX and Hyper-V code, then just make
the helper non-inline in hyperv.c, modifying MSR bitmaps will never be a hot path
for any sane guest.

I don't think I have a strong preference either way.  In a perfect world we'd keep
Hyper-V code separate, but practically speaking I think trying to move everything
into hyperv.c would result in far too many stubs and some weird function names.

Side topic, we should really have a wrapper for static_branch_unlikely(&enable_evmcs)
so that the static key can be defined iff CONFIG_HYPERV=y.  I'll send a patch.



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux