Re: [PATCH] KVM: PPC: Book3S HV: Fix host crash on changing HPT size

On Fri, Jul 21, 2017 at 03:44:13PM +1000, Paul Mackerras wrote:
> Commit f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB
> ioctl() to change HPT size", 2016-12-20) changed the behaviour of
> the KVM_PPC_ALLOCATE_HTAB ioctl so that it now allocates a new HPT
> and new revmap array if there was a previously-allocated HPT of a
> different size from the size being requested.  In this case, we need
> to reset the rmap arrays of the memslots, because the rmap arrays
> will contain references to HPTEs which are no longer valid.  Worse,
> these references are also references to slots in the new revmap
> array (which parallels the HPT), and the new revmap array contains
> random contents, since it doesn't get zeroed on allocation.
> 
> The effect of having these stale references to slots in the revmap
> array that contain random contents is that subsequent calls to
> functions such as kvmppc_add_revmap_chain will crash because they
> will interpret the non-zero contents of the revmap array as HPTE
> indexes and thus index outside of the revmap array.  This leads to
> host crashes such as the following.
> 
> [ 7072.862122] Unable to handle kernel paging request for data at address 0xd000000c250c00f8
> [ 7072.862218] Faulting instruction address: 0xc0000000000e1c78
> [ 7072.862233] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 7072.862286] SMP NR_CPUS=1024
> [ 7072.862286] NUMA
> [ 7072.862325] PowerNV
> [ 7072.862378] Modules linked in: kvm_hv vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 mlx5_ib ib_core ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel i2c_opal nfsd auth_rpcgss oid_registry
> [ 7072.863085]  nfs_acl lockd grace sunrpc kvm_pr kvm xfs libcrc32c scsi_dh_alua dm_service_time radeon lpfc nvme_fc nvme_fabrics nvme_core scsi_transport_fc i2c_algo_bit tg3 drm_kms_helper ptp pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm dm_multipath i2c_core cxgb3 mlx5_core mdio [last unloaded: kvm_hv]
> [ 7072.863381] CPU: 72 PID: 56929 Comm: qemu-system-ppc Not tainted 4.12.0-kvm+ #59
> [ 7072.863457] task: c000000fe29e7600 task.stack: c000001e3ffec000
> [ 7072.863520] NIP: c0000000000e1c78 LR: c0000000000e2e3c CTR: c0000000000e25f0
> [ 7072.863596] REGS: c000001e3ffef560 TRAP: 0300   Not tainted  (4.12.0-kvm+)
> [ 7072.863658] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE,TM[E]>
> [ 7072.863667]   CR: 44082882  XER: 20000000
> [ 7072.863767] CFAR: c0000000000e2e38 DAR: d000000c250c00f8 DSISR: 42000000 SOFTE: 1
> GPR00: c0000000000e2e3c c000001e3ffef7e0 c000000001407d00 d000000c250c00f0
> GPR04: d00000006509fb70 d00000000b3d2048 0000000003ffdfb7 0000000000000000
> GPR08: 00000001007fdfb7 00000000c000000f d0000000250c0000 000000000070f7bf
> GPR12: 0000000000000008 c00000000fdad000 0000000010879478 00000000105a0d78
> GPR16: 00007ffaf4080000 0000000000001190 0000000000000000 0000000000010000
> GPR20: 4001ffffff000415 d00000006509fb70 0000000004091190 0000000ee1881190
> GPR24: 0000000003ffdfb7 0000000003ffdfb7 00000000007fdfb7 c000000f5c958000
> GPR28: d00000002d09fb70 0000000003ffdfb7 d00000006509fb70 d00000000b3d2048
> [ 7072.864439] NIP [c0000000000e1c78] kvmppc_add_revmap_chain+0x88/0x130
> [ 7072.864503] LR [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
> [ 7072.864566] Call Trace:
> [ 7072.864594] [c000001e3ffef7e0] [c000001e3ffef830] 0xc000001e3ffef830 (unreliable)
> [ 7072.864671] [c000001e3ffef830] [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
> [ 7072.864751] [c000001e3ffef920] [d00000000b38d878] kvmppc_map_vrma+0x168/0x200 [kvm_hv]
> [ 7072.864831] [c000001e3ffef9e0] [d00000000b38a684] kvmppc_vcpu_run_hv+0x1284/0x1300 [kvm_hv]
> [ 7072.864914] [c000001e3ffefb30] [d00000000f465664] kvmppc_vcpu_run+0x44/0x60 [kvm]
> [ 7072.865008] [c000001e3ffefb60] [d00000000f461864] kvm_arch_vcpu_ioctl_run+0x114/0x290 [kvm]
> [ 7072.865152] [c000001e3ffefbe0] [d00000000f453c98] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
> [ 7072.865292] [c000001e3ffefd40] [c000000000389328] do_vfs_ioctl+0xd8/0x8c0
> [ 7072.865410] [c000001e3ffefde0] [c000000000389be4] SyS_ioctl+0xd4/0x130
> [ 7072.865526] [c000001e3ffefe30] [c00000000000b760] system_call+0x58/0x6c
> [ 7072.865644] Instruction dump:
> [ 7072.865715] e95b2110 793a0020 7b4926e4 7f8a4a14 409e0098 807c000c 786326e4 7c6a1a14
> [ 7072.865857] 935e0008 7bbd0020 813c000c 913e000c <93a30008> 93bc000c 48000038 60000000
> [ 7072.866001] ---[ end trace 627b6e4bf8080edc ]---
> 
> Note that to trigger this, it is necessary to use a recent upstream
> QEMU (or other userspace that resizes the HPT at CAS time), specify
> a maximum memory size substantially larger than the current memory
> size, and boot a guest kernel that does not support HPT resizing.
> 
> This fixes the problem by resetting the rmap arrays when the old HPT
> is freed.
> 
> Fixes: f98a8bf9ee20 ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size")
> Cc: stable@xxxxxxxxxxxxxxx # v4.11+
> Signed-off-by: Paul Mackerras <paulus@xxxxxxxxxx>

Reviewed-by: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>

> ---
>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index 710e491..1c10e26 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -164,8 +164,10 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order)
>  		goto out;
>  	}
>  
> -	if (kvm->arch.hpt.virt)
> +	if (kvm->arch.hpt.virt) {
>  		kvmppc_free_hpt(&kvm->arch.hpt);
> +		kvmppc_rmap_reset(kvm);
> +	}
>  
>  	err = kvmppc_allocate_hpt(&info, order);
>  	if (err < 0)
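
The failure mode described in the commit message can be illustrated with a
small userspace toy model. This is only a sketch under stated assumptions:
every name here (toy_vm, toy_revmap, toy_add_revmap_chain, toy_resize_hpt,
RMAP_PRESENT, RMAP_INDEX) is hypothetical and the layout is simplified, so it
is not the kernel's code. The revmap is filled with 0xff on allocation to stand
in for the random contents of an unzeroed array, and the chain-insert helper
returns -1 at the point where the real code would index outside the revmap.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, simplified layout, not kernel code. */
#define RMAP_PRESENT 0x80000000u  /* assumed "chain head valid" flag */
#define RMAP_INDEX   0x7fffffffu  /* assumed mask for the head index */

struct toy_revmap { uint32_t forw, back; };  /* circular chain links */

struct toy_hpt {
    uint32_t nslots;
    struct toy_revmap *rev;  /* parallels the HPT, one entry per slot */
};

struct toy_vm {
    struct toy_hpt hpt;
    uint32_t *rmap;          /* one entry per guest page */
    size_t npages;
};

/* Allocate a revmap; 0xff fill emulates unzeroed, garbage contents. */
int toy_alloc_hpt(struct toy_hpt *hpt, uint32_t nslots)
{
    size_t bytes = (size_t)nslots * sizeof(*hpt->rev);

    hpt->rev = malloc(bytes);
    if (!hpt->rev)
        return -1;
    memset(hpt->rev, 0xff, bytes);
    hpt->nslots = nslots;
    return 0;
}

/* Link slot idx into the chain for a page; -1 where the kernel would fault. */
int toy_add_revmap_chain(struct toy_vm *vm, size_t page, uint32_t idx)
{
    struct toy_revmap *rev = vm->hpt.rev;
    uint32_t *rmap = &vm->rmap[page];

    assert(page < vm->npages && idx < vm->hpt.nslots);
    if (*rmap & RMAP_PRESENT) {
        uint32_t head = *rmap & RMAP_INDEX;
        uint32_t tail;

        if (head >= vm->hpt.nslots)
            return -1;       /* stale head index from the old HPT */
        tail = rev[head].back;
        if (tail >= vm->hpt.nslots)
            return -1;       /* garbage link read from the new revmap */
        rev[idx].forw = head;
        rev[idx].back = tail;
        rev[tail].forw = idx;
        rev[head].back = idx;
    } else {
        rev[idx].forw = rev[idx].back = idx;  /* chain of one */
        *rmap = idx | RMAP_PRESENT;
    }
    return 0;
}

/* Replace the HPT; reset_rmap selects the fixed (1) or buggy (0) behaviour. */
int toy_resize_hpt(struct toy_vm *vm, uint32_t nslots, int reset_rmap)
{
    free(vm->hpt.rev);
    vm->hpt.rev = NULL;
    if (reset_rmap)
        memset(vm->rmap, 0, vm->npages * sizeof(*vm->rmap));
    return toy_alloc_hpt(&vm->hpt, nslots);
}

/* Map a page, resize, map again; returns the result of the second insert. */
int toy_demo(int reset_rmap)
{
    uint32_t rmap[4] = { 0 };
    struct toy_vm vm = { .rmap = rmap, .npages = 4 };
    int ret;

    if (toy_alloc_hpt(&vm.hpt, 16) < 0)
        return -2;
    if (toy_add_revmap_chain(&vm, 0, 3) < 0 ||
        toy_resize_hpt(&vm, 32, reset_rmap) < 0) {
        free(vm.hpt.rev);
        return -2;
    }
    ret = toy_add_revmap_chain(&vm, 0, 5);
    free(vm.hpt.rev);
    return ret;
}
```

In this model, toy_demo(0) leaves the stale rmap entry in place and the second
insert fails on the garbage link, while toy_demo(1) clears the rmap entries
when the old table is freed and the second insert succeeds. That mirrors the
ordering in the hunk above: free the old HPT, then reset the rmap arrays
before allocating the new one.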

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
