Re: [PATCH] KVM: PPC: Book3S HV: fix a oops in kvmppc_uvmem_page_free()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 30, 2020 at 04:25:26PM -0700, Ram Pai wrote:
> Observed the following oops while stress-testing, using multiple
> secureVM on a distro kernel. However this issue theoritically exists in
> 5.5 kernel and later.
> 
> This issue occurs when the total number of requested device-PFNs exceed
> the total-number of available device-PFNs.  PFN migration fails to
> allocate a device-pfn, which causes migrate_vma_finalize() to trigger
> kvmppc_uvmem_page_free() on a page, that is not associated with any
> device-pfn.  kvmppc_uvmem_page_free() blindly tries to access the
> contents of the private data which can be null, leading to the following
> kernel fault.
> 
>  --------------------------------------------------------------------------
>  Unable to handle kernel paging request for data at address 0x00000011
>  Faulting instruction address: 0xc00800000e36e110
>  Oops: Kernel access of bad area, sig: 11 [#1]
>  LE SMP NR_CPUS=2048 NUMA PowerNV
> ....
>  MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
> 		 CR: 24424822  XER: 00000000
>  CFAR: c000000000e3d764 DAR: 0000000000000011 DSISR: 40000000 IRQMASK: 0
>  GPR00: c00800000e36e0a4 c000001f1d59f610 c00800000e38a400 0000000000000000
>  GPR04: c000001fa5000000 fffffffffffffffe ffffffffffffffff c000201fffeaf300
>  GPR08: 00000000000001f0 0000000000000000 0000000000000f80 c00800000e373608
>  GPR12: c000000000e3d710 c000201fffeaf300 0000000000000001 00007fef87360000
>  GPR16: 00007fff97db4410 c000201c3b66a578 ffffffffffffffff 0000000000000000
>  GPR20: 0000000119db9ad0 000000000000000a fffffffffffffffc 0000000000000001
>  GPR24: c000201c3b660000 c000001f1d59f7a0 c0000000004cffb0 0000000000000001
>  GPR28: 0000000000000000 c00a001ff003e000 c00800000e386150 0000000000000f80
>  NIP [c00800000e36e110] kvmppc_uvmem_page_free+0xc8/0x210 [kvm_hv]
>  LR [c00800000e36e0a4] kvmppc_uvmem_page_free+0x5c/0x210 [kvm_hv]
>  Call Trace:
>  [c000000000512010] free_devmap_managed_page+0xd0/0x100
>  [c0000000003f71d0] put_devmap_managed_page+0xa0/0xc0
>  [c0000000004d24bc] migrate_vma_finalize+0x32c/0x410
>  [c00800000e36e828] kvmppc_svm_page_in.constprop.5+0xa0/0x460 [kvm_hv]
>  [c00800000e36eddc] kvmppc_uv_migrate_mem_slot.isra.2+0x1f4/0x230 [kvm_hv]
>  [c00800000e36fa98] kvmppc_h_svm_init_done+0x90/0x170 [kvm_hv]
>  [c00800000e35bb14] kvmppc_pseries_do_hcall+0x1ac/0x10a0 [kvm_hv]
>  [c00800000e35edf4] kvmppc_vcpu_run_hv+0x83c/0x1060 [kvm_hv]
>  [c00800000e95eb2c] kvmppc_vcpu_run+0x34/0x48 [kvm]
>  [c00800000e95a2dc] kvm_arch_vcpu_ioctl_run+0x374/0x830 [kvm]
>  [c00800000e9433b4] kvm_vcpu_ioctl+0x45c/0x7c0 [kvm]
>  [c0000000005451d0] do_vfs_ioctl+0xe0/0xaa0
>  [c000000000545d64] sys_ioctl+0xc4/0x160
>  [c00000000000b408] system_call+0x5c/0x70
>  Instruction dump:
>  a12d1174 2f890000 409e0158 a1271172 3929ffff b1271172 7c2004ac 39200000
>  913e0140 39200000 e87d0010 f93d0010 <89230011> e8c30000 e9030008 2f890000
>  --------------------------------------------------------------------------
> 
>  Fix the oops..
> 
> fixes: ca9f49 ("KVM: PPC: Book3S HV: Support for running secure guests")
> Signed-off-by: Ram Pai <linuxram@xxxxxxxxxx>
> ---
>  arch/powerpc/kvm/book3s_hv_uvmem.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
> index 2806983..f4002bf 100644
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@ -1018,13 +1018,15 @@ static void kvmppc_uvmem_page_free(struct page *page)
>  {
>  	unsigned long pfn = page_to_pfn(page) -
>  			(kvmppc_uvmem_pgmap.res.start >> PAGE_SHIFT);
> -	struct kvmppc_uvmem_page_pvt *pvt;
> +	struct kvmppc_uvmem_page_pvt *pvt = page->zone_device_data;
> +
> +	if (!pvt)
> +		return;
>  
>  	spin_lock(&kvmppc_uvmem_bitmap_lock);
>  	bitmap_clear(kvmppc_uvmem_bitmap, pfn, 1);
>  	spin_unlock(&kvmppc_uvmem_bitmap_lock);
>  
> -	pvt = page->zone_device_data;
>  	page->zone_device_data = NULL;
>  	if (pvt->remove_gfn)
>  		kvmppc_gfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);

In our case, device pages that are in use are always associated with a valid
pvt member. See kvmppc_uvmem_get_page() which returns failure if it
runs out of device pfns and that will result in proper failure of
page-in calls.

For the case where we run out of device pfns, migrate_vma_finalize() will
restore the original PTE and will not replace the PTE with device private PTE.

Also kvmppc_uvmem_page_free() (=dev_pagemap_ops.page_free()) is never
called for non-device-private pages.

This could be a use-after-free case possibly arising out of the new state
changes in HV. If so, this fix will only mask the bug and not address the
original problem.

Regards,
Bharata.



[Index of Archives]     [KVM Development]     [KVM ARM]     [KVM ia64]     [Linux Virtualization]     [Linux USB Devel]     [Linux Video]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux