Observed the following oops while stress-testing with multiple secure VMs
on a distro kernel. However, this issue theoretically exists in the 5.5
kernel and later.

This issue occurs when the total number of requested device-PFNs exceeds
the total number of available device-PFNs. PFN migration fails to
allocate a device-PFN, which causes migrate_vma_finalize() to trigger
kvmppc_uvmem_page_free() on a page that is not associated with any
device-PFN. kvmppc_uvmem_page_free() blindly tries to access the
contents of the private data, which can be NULL, leading to the following
kernel fault.

 --------------------------------------------------------------------------
 Unable to handle kernel paging request for data at address 0x00000011
 Faulting instruction address: 0xc00800000e36e110
 Oops: Kernel access of bad area, sig: 11 [#1]
 LE SMP NR_CPUS=2048 NUMA PowerNV
 ....
 MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24424822  XER: 00000000
 CFAR: c000000000e3d764 DAR: 0000000000000011 DSISR: 40000000 IRQMASK: 0
 GPR00: c00800000e36e0a4 c000001f1d59f610 c00800000e38a400 0000000000000000
 GPR04: c000001fa5000000 fffffffffffffffe ffffffffffffffff c000201fffeaf300
 GPR08: 00000000000001f0 0000000000000000 0000000000000f80 c00800000e373608
 GPR12: c000000000e3d710 c000201fffeaf300 0000000000000001 00007fef87360000
 GPR16: 00007fff97db4410 c000201c3b66a578 ffffffffffffffff 0000000000000000
 GPR20: 0000000119db9ad0 000000000000000a fffffffffffffffc 0000000000000001
 GPR24: c000201c3b660000 c000001f1d59f7a0 c0000000004cffb0 0000000000000001
 GPR28: 0000000000000000 c00a001ff003e000 c00800000e386150 0000000000000f80
 NIP [c00800000e36e110] kvmppc_uvmem_page_free+0xc8/0x210 [kvm_hv]
 LR [c00800000e36e0a4] kvmppc_uvmem_page_free+0x5c/0x210 [kvm_hv]
 Call Trace:
 [c000000000512010] free_devmap_managed_page+0xd0/0x100
 [c0000000003f71d0] put_devmap_managed_page+0xa0/0xc0
 [c0000000004d24bc] migrate_vma_finalize+0x32c/0x410
 [c00800000e36e828] kvmppc_svm_page_in.constprop.5+0xa0/0x460 [kvm_hv]
 [c00800000e36eddc] kvmppc_uv_migrate_mem_slot.isra.2+0x1f4/0x230 [kvm_hv]
 [c00800000e36fa98] kvmppc_h_svm_init_done+0x90/0x170 [kvm_hv]
 [c00800000e35bb14] kvmppc_pseries_do_hcall+0x1ac/0x10a0 [kvm_hv]
 [c00800000e35edf4] kvmppc_vcpu_run_hv+0x83c/0x1060 [kvm_hv]
 [c00800000e95eb2c] kvmppc_vcpu_run+0x34/0x48 [kvm]
 [c00800000e95a2dc] kvm_arch_vcpu_ioctl_run+0x374/0x830 [kvm]
 [c00800000e9433b4] kvm_vcpu_ioctl+0x45c/0x7c0 [kvm]
 [c0000000005451d0] do_vfs_ioctl+0xe0/0xaa0
 [c000000000545d64] sys_ioctl+0xc4/0x160
 [c00000000000b408] system_call+0x5c/0x70
 Instruction dump:
 a12d1174 2f890000 409e0158 a1271172 3929ffff b1271172 7c2004ac 39200000
 913e0140 39200000 e87d0010 f93d0010 <89230011> e8c30000 e9030008 2f890000
 --------------------------------------------------------------------------

Fix the oops by returning early from kvmppc_uvmem_page_free() when the
page has no private data.

Fixes: ca9f49 ("KVM: PPC: Book3S HV: Support for running secure guests")
Signed-off-by: Ram Pai <linuxram@xxxxxxxxxx>
---
 arch/powerpc/kvm/book3s_hv_uvmem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 2806983..f4002bf 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -1018,13 +1018,15 @@ static void kvmppc_uvmem_page_free(struct page *page)
 {
 	unsigned long pfn = page_to_pfn(page) -
 			(kvmppc_uvmem_pgmap.res.start >> PAGE_SHIFT);
-	struct kvmppc_uvmem_page_pvt *pvt;
+	struct kvmppc_uvmem_page_pvt *pvt = page->zone_device_data;
+
+	if (!pvt)
+		return;
 
 	spin_lock(&kvmppc_uvmem_bitmap_lock);
 	bitmap_clear(kvmppc_uvmem_bitmap, pfn, 1);
 	spin_unlock(&kvmppc_uvmem_bitmap_lock);
 
-	pvt = page->zone_device_data;
 	page->zone_device_data = NULL;
 	if (pvt->remove_gfn)
 		kvmppc_gfn_remove(pvt->gpa >> PAGE_SHIFT, pvt->kvm);
-- 
1.8.3.1
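
For readers who want to see the guard in isolation, the following is a minimal userspace sketch of the early-return pattern the patch adds. It is not kernel code: `struct fake_page`, `struct fake_pvt`, `fake_page_free()` and the `gfn_removals` counter are hypothetical stand-ins invented for this illustration, and the bitmap and locking steps are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page private data (not the kernel struct). */
struct fake_pvt {
	bool remove_gfn;
	unsigned long gpa;
};

/* Hypothetical stand-in for struct page; only the field the guard reads. */
struct fake_page {
	void *zone_device_data;	/* NULL when no private data was attached */
};

/* Counts simulated GFN removals so the effect of a free can be observed. */
static int gfn_removals;

/*
 * Models the patched free callback: read the private pointer first and
 * return before any dereference if it is NULL.
 * Returns true when teardown actually ran.
 */
static bool fake_page_free(struct fake_page *page)
{
	struct fake_pvt *pvt = page->zone_device_data;

	if (!pvt)
		return false;	/* nothing was set up, so nothing to tear down */

	page->zone_device_data = NULL;
	if (pvt->remove_gfn)
		gfn_removals++;
	return true;
}
```

Calling `fake_page_free()` on a page whose `zone_device_data` is NULL returns immediately instead of dereferencing the pointer, which is the case the oops above hit.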