Nicholas Piggin <npiggin@xxxxxxxxx> writes: > Excerpts from Fabiano Rosas's message of January 20, 2021 7:07 am: >> Nicholas Piggin <npiggin@xxxxxxxxx> writes: >> >>> Excerpts from Fabiano Rosas's message of January 19, 2021 11:46 am: >>>> Resending because the previous got spam-filtered: >>>> >>>> Nicholas Piggin <npiggin@xxxxxxxxx> writes: >>>> >>>>> This reverts much of commit c01015091a770 ("KVM: PPC: Book3S HV: Run HPT >>>>> guests on POWER9 radix hosts"), which was required to run HPT guests on >>>>> RPT hosts on early POWER9 CPUs without support for "mixed mode", which >>>>> meant the host could not run with MMU on while guests were running. >>>>> >>>>> This code has some corner case bugs, e.g., when the guest hits a machine >>>>> check or HMI the primary locks up waiting for secondaries to switch LPCR >>>>> to host, which they never do. This could all be fixed in software, but >>>>> most CPUs in production have mixed mode support, and those that don't >>>>> are believed to be all in installations that don't use this capability. >>>>> So simplify things and remove support. >>>> >>>> With this patch in a DD2.1 machine + indep_threads_mode=N + >>>> disable_radix, QEMU aborts and dumps registers, is that intended? >>> >>> Yes. That configuration is hanging handling MCEs in the guest with some >>> threads waiting forever to synchronize. Paul suggested it was never a >>> supported configuration so we might just remove it. >>> >> >> OK, so: >> >> Tested-by: Fabiano Rosas <farosas@xxxxxxxxxxxxx> >> >>>> Could we use the 'no_mixing_hpt_and_radix' logic in check_extension to >>>> advertise only KVM_CAP_PPC_MMU_RADIX to the guest via OV5 so it doesn't >>>> try to run hash? >>>> >>>> For instance, if I hack QEMU's 'spapr_dt_ov5_platform_support' from >>>> OV5_MMU_BOTH to OV5_MMU_RADIX_300 then it boots succesfuly, but the >>>> guest turns into radix, due to this code in prom_init: >>>> >>>> prom_parse_mmu_model: >>>> >>>> case OV5_FEAT(OV5_MMU_RADIX): /* Only Radix */ >>>> prom_debug("MMU - radix only\n"); >>>> if (prom_radix_disable) { >>>> /* >>>> * If we __have__ to do radix, we're better off ignoring >>>> * the command line rather than not booting. >>>> */ >>>> prom_printf("WARNING: Ignoring cmdline option disable_radix\n"); >>>> } >>>> support->radix_mmu = true; >>>> break; >>>> >>>> It seems we could explicitly say that the host does not support hash and >>>> that would align with the above code. >>> >>> I'm not sure, sounds like you could, on the other hand these aborts seem >>> like the prefered failure mode for these kinds of configuration issues, >>> I don't know what the policy is, is reverting back to radix acceptable? >>> >> >> Yeah, I couldn't find documentation about why we're reverting back to >> radix. I personally dislike it, but there is already a precedent so I'm >> not sure. A radix guest on a hash host does the same transparent >> conversion AFAICT. >> >> But despite that, this patch removes support for hash MMU in this >> particular scenario. I don't see why continuing to tell the guest we >> support hash. >> >> Anyway, here's a patch if you decide to go that way (tested w/ DD2.1 & >> 2.3 machines): > > Thanks, I don't mind it, have to see if the maintainer will take it :) > > You could add a small changelog / SOB and I could putit after my patch? > Sure, I'll reply to this thread with a proper patch. >> >> --- >> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h >> index 0a056c64c317b..53743555676d6 100644 >> --- a/arch/powerpc/include/asm/kvm_ppc.h >> +++ b/arch/powerpc/include/asm/kvm_ppc.h >> @@ -314,6 +314,7 @@ struct kvmppc_ops { >> int size); >> int (*enable_svm)(struct kvm *kvm); >> int (*svm_off)(struct kvm *kvm); >> + bool (*hash_possible)(void); >> }; >> >> extern struct kvmppc_ops *kvmppc_hv_ops; >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index 6f612d240392f..2d1e8aba22b85 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -5599,6 +5599,15 @@ static int kvmhv_svm_off(struct kvm *kvm) >> return ret; >> } >> >> +static bool kvmppc_hash_possible(void) >> +{ >> + if (radix_enabled() && no_mixing_hpt_and_radix) >> + return false; >> + >> + return cpu_has_feature(CPU_FTR_ARCH_300) && >> + cpu_has_feature(CPU_FTR_HVMODE); >> +} > > Just be careful, it's hash_v3 specifically. Either make this return true > for arch < 300 add the ARCH_300 check in the ioctl, or rename to include > v3. > >> + >> static struct kvmppc_ops kvm_ops_hv = { >> .get_sregs = kvm_arch_vcpu_ioctl_get_sregs_hv, >> .set_sregs = kvm_arch_vcpu_ioctl_set_sregs_hv, >> @@ -5642,6 +5651,7 @@ static struct kvmppc_ops kvm_ops_hv = { >> .store_to_eaddr = kvmhv_store_to_eaddr, >> .enable_svm = kvmhv_enable_svm, >> .svm_off = kvmhv_svm_off, >> + .hash_possible = kvmppc_hash_possible, >> }; >> > > How about adding an op which can check extensions? It could return false > if it wasn't checked and so default to the generic checks in > kvm_vm_ioctl_check_extension, and take a pointer to 'r' to set. > I'm not sure I get the part about "return false if it wasn't checked". Do you mean like this? static bool kvmhv_check_extension(long ext, int *r) { switch (ext) { case KVM_CAP_PPC_MMU_HASH_V3: r = kvmppc_hash_v3_possible(); break; default: return false; } return true; } And then we could move all of the #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE cases into it and early in kvm_vm_ioctl_check_extension have something like: if (hv_enabled && kvmppc_hv_ops->check_extension(ext, &r)) return r; >> static int kvm_init_subcore_bitmap(void) >> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c >> index cf52d26f49cd7..99ced6c570e74 100644 >> --- a/arch/powerpc/kvm/powerpc.c >> +++ b/arch/powerpc/kvm/powerpc.c >> @@ -611,8 +611,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) >> r = !!(hv_enabled && radix_enabled()); >> break; >> case KVM_CAP_PPC_MMU_HASH_V3: >> - r = !!(hv_enabled && cpu_has_feature(CPU_FTR_ARCH_300) && >> - cpu_has_feature(CPU_FTR_HVMODE)); >> + r = !!(hv_enabled && kvmppc_hv_ops->hash_possible()); >> break; >> case KVM_CAP_PPC_NESTED_HV: >> r = !!(hv_enabled && kvmppc_hv_ops->enable_nested && > > Thanks, > Nick