On Sat, 19 May 2018 15:56:38 +1000
Paul Mackerras <paulus@xxxxxxxxxx> wrote:

> This relaxes the restriction on using PR KVM on POWER9.  The existing
> code does work inside a guest partition running in HPT mode, because
> hypercalls such as H_ENTER use the old HPTE format, not the new
> format used by POWER9, and so no change to PR KVM's HPT manipulation
> code is required.  PR KVM will still refuse to run if the kernel is
> using radix translation or if it is running bare-metal.
>
> Signed-off-by: Paul Mackerras <paulus@xxxxxxxxxx>
> ---

Paul,

I have built a 4.16.0 kernel + this patch and booted the L1 guest with
"disable_radix=on". I could then successfully boot an L2 guest, using the
same kernel for simplicity. Both guests use identical fedora28 images.

So it seems to be working at first sight.

But if I boot the L2 guest with the default fedora28 kernel, i.e.
4.16.9-300.fc28.ppc64le, the L2 guest hangs.

OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.16.9-300.fc28.ppc64le (mockbuild@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) (gcc version 8.1.1 20180502 (Red Hat 8.1.1-1) (GCC)) #1 SMP Thu May 17 04:31:32 UTC 2018
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=/boot/vmlinuz-4.16.9-300.fc28.ppc64le root=UUID=22128c5c-30b1-4e0a-ac16-95853df31131 ro rhgb console=hvc0 early_printk LANG=en_US.UTF-8
Max number of cores passed to firmware: 1024 (NR_CPUS = 1024)
Calling ibm,client-architecture-support... done
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000004e70000
  alloc_top    : 0000000030000000
  alloc_top_hi : 0000000100000000
  rmo_top      : 0000000030000000
  ram_top      : 0000000100000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000004e80000 -> 0x0000000004e80aaf
Device tree struct  0x0000000004e90000 -> 0x0000000004ea0000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000002000000 ...

(qemu) p $pc
0xc000000000026aa0
(qemu) p $lr
0xc000000000119ff4

# addr2line -e /usr/lib/debug/lib/modules/4.16.9-300.fc28.ppc64le/vmlinux 0xc000000000026aa0
/usr/src/debug/kernel-4.16.fc28/linux-4.16.9-300.fc28.ppc64le/./arch/powerpc/include/asm/time.h:115
# addr2line -e /usr/lib/debug/lib/modules/4.16.9-300.fc28.ppc64le/vmlinux 0xc000000000119ff4
/usr/src/debug/kernel-4.16.fc28/linux-4.16.9-300.fc28.ppc64le/kernel/panic.c:300

i.e., the final mdelay(PANIC_TIMER_STEP) in panic().

Not sure how to debug this further; any suggestion is welcome :)

Cheers,

--
Greg

>  arch/powerpc/kvm/book3s_pr.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 67061d3..3d0251e 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -1735,9 +1735,16 @@ static void kvmppc_core_destroy_vm_pr(struct kvm *kvm)
>  static int kvmppc_core_check_processor_compat_pr(void)
>  {
>  	/*
> -	 * Disable KVM for Power9 untill the required bits merged.
> +	 * PR KVM can work on POWER9 inside a guest partition
> +	 * running in HPT mode.  It can't work if we are using
> +	 * radix translation (because radix provides no way for
> +	 * a process to have unique translations in quadrant 3)
> +	 * or in a bare-metal HPT-mode host (because POWER9
> +	 * uses a modified HPTE format which the PR KVM code
> +	 * has not been adapted to use).
>  	 */
> -	if (cpu_has_feature(CPU_FTR_ARCH_300))
> +	if (cpu_has_feature(CPU_FTR_ARCH_300) &&
> +	    (radix_enabled() || cpu_has_feature(CPU_FTR_HVMODE)))
>  		return -EIO;
>  	return 0;
>  }
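
For reference, the decision made by the patched check above can be
modelled outside the kernel. This is only a sketch: the three booleans
stand in for cpu_has_feature(CPU_FTR_ARCH_300), radix_enabled() and
cpu_has_feature(CPU_FTR_HVMODE), and the names host_state and
pr_compat_model are hypothetical, not kernel identifiers.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel predicates used by the patch:
 *   arch_300 ~ cpu_has_feature(CPU_FTR_ARCH_300)  (POWER9 or later)
 *   radix    ~ radix_enabled()
 *   hvmode   ~ cpu_has_feature(CPU_FTR_HVMODE)    (running bare-metal)
 */
struct host_state {
	bool arch_300;
	bool radix;
	bool hvmode;
};

/* Mirrors the condition in the patched compat check: 0 or -EIO. */
static int pr_compat_model(const struct host_state *h)
{
	if (h->arch_300 && (h->radix || h->hvmode))
		return -EIO;
	return 0;
}
```

In this model a POWER9 guest partition in HPT mode is
{ true, false, false } and passes, POWER9 with radix or in HV mode is
refused with -EIO, and anything older than POWER9 always passes.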