Excerpts from Fabiano Rosas's message of January 20, 2021 7:07 am: > Nicholas Piggin <npiggin@xxxxxxxxx> writes: > >> Excerpts from Fabiano Rosas's message of January 19, 2021 11:46 am: >>> Resending because the previous got spam-filtered: >>> >>> Nicholas Piggin <npiggin@xxxxxxxxx> writes: >>> >>>> This reverts much of commit c01015091a770 ("KVM: PPC: Book3S HV: Run HPT >>>> guests on POWER9 radix hosts"), which was required to run HPT guests on >>>> RPT hosts on early POWER9 CPUs without support for "mixed mode", which >>>> meant the host could not run with MMU on while guests were running. >>>> >>>> This code has some corner case bugs, e.g., when the guest hits a machine >>>> check or HMI the primary locks up waiting for secondaries to switch LPCR >>>> to host, which they never do. This could all be fixed in software, but >>>> most CPUs in production have mixed mode support, and those that don't >>>> are believed to be all in installations that don't use this capability. >>>> So simplify things and remove support. >>> >>> With this patch in a DD2.1 machine + indep_threads_mode=N + >>> disable_radix, QEMU aborts and dumps registers, is that intended? >> >> Yes. That configuration is hanging handling MCEs in the guest with some >> threads waiting forever to synchronize. Paul suggested it was never a >> supported configuration so we might just remove it. >> > > OK, so: > > Tested-by: Fabiano Rosas <farosas@xxxxxxxxxxxxx> > >>> Could we use the 'no_mixing_hpt_and_radix' logic in check_extension to >>> advertise only KVM_CAP_PPC_MMU_RADIX to the guest via OV5 so it doesn't >>> try to run hash? >>> >>> For instance, if I hack QEMU's 'spapr_dt_ov5_platform_support' from >>> OV5_MMU_BOTH to OV5_MMU_RADIX_300 then it boots succesfuly, but the >>> guest turns into radix, due to this code in prom_init: >>> >>> prom_parse_mmu_model: >>> >>> case OV5_FEAT(OV5_MMU_RADIX): /* Only Radix */ >>> prom_debug("MMU - radix only\n"); >>> if (prom_radix_disable) { >>> /* >>> * If we __have__ to do radix, we're better off ignoring >>> * the command line rather than not booting. >>> */ >>> prom_printf("WARNING: Ignoring cmdline option disable_radix\n"); >>> } >>> support->radix_mmu = true; >>> break; >>> >>> It seems we could explicitly say that the host does not support hash and >>> that would align with the above code. >> >> I'm not sure, sounds like you could, on the other hand these aborts seem >> like the prefered failure mode for these kinds of configuration issues, >> I don't know what the policy is, is reverting back to radix acceptable? >> > > Yeah, I couldn't find documentation about why we're reverting back to > radix. I personally dislike it, but there is already a precedent so I'm > not sure. A radix guest on a hash host does the same transparent > conversion AFAICT. > > But despite that, this patch removes support for hash MMU in this > particular scenario. I don't see why continuing to tell the guest we > support hash. > > Anyway, here's a patch if you decide to go that way (tested w/ DD2.1 & > 2.3 machines): Thanks, I don't mind it, have to see if the maintainer will take it :) You could add a small changelog / SOB and I could putit after my patch? > > --- > diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h > index 0a056c64c317b..53743555676d6 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -314,6 +314,7 @@ struct kvmppc_ops { > int size); > int (*enable_svm)(struct kvm *kvm); > int (*svm_off)(struct kvm *kvm); > + bool (*hash_possible)(void); > }; > > extern struct kvmppc_ops *kvmppc_hv_ops; > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 6f612d240392f..2d1e8aba22b85 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -5599,6 +5599,15 @@ static int kvmhv_svm_off(struct kvm *kvm) > return ret; > } > > +static bool kvmppc_hash_possible(void) > +{ > + if (radix_enabled() && no_mixing_hpt_and_radix) > + return false; > + > + return cpu_has_feature(CPU_FTR_ARCH_300) && > + cpu_has_feature(CPU_FTR_HVMODE); > +} Just be careful, it's hash_v3 specifically. Either make this return true for arch < 300 add the ARCH_300 check in the ioctl, or rename to include v3. > + > static struct kvmppc_ops kvm_ops_hv = { > .get_sregs = kvm_arch_vcpu_ioctl_get_sregs_hv, > .set_sregs = kvm_arch_vcpu_ioctl_set_sregs_hv, > @@ -5642,6 +5651,7 @@ static struct kvmppc_ops kvm_ops_hv = { > .store_to_eaddr = kvmhv_store_to_eaddr, > .enable_svm = kvmhv_enable_svm, > .svm_off = kvmhv_svm_off, > + .hash_possible = kvmppc_hash_possible, > }; > How about adding an op which can check extensions? It could return false if it wasn't checked and so default to the generic checks in kvm_vm_ioctl_check_extension, and take a pointer to 'r' to set. > static int kvm_init_subcore_bitmap(void) > diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c > index cf52d26f49cd7..99ced6c570e74 100644 > --- a/arch/powerpc/kvm/powerpc.c > +++ b/arch/powerpc/kvm/powerpc.c > @@ -611,8 +611,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > r = !!(hv_enabled && radix_enabled()); > break; > case KVM_CAP_PPC_MMU_HASH_V3: > - r = !!(hv_enabled && cpu_has_feature(CPU_FTR_ARCH_300) && > - cpu_has_feature(CPU_FTR_HVMODE)); > + r = !!(hv_enabled && kvmppc_hv_ops->hash_possible()); > break; > case KVM_CAP_PPC_NESTED_HV: > r = !!(hv_enabled && kvmppc_hv_ops->enable_nested && Thanks, Nick