Hello,

On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment are also
controlled by MMIO in the CPU presenter subengine.

PowerNV/baremetal Linux runs natively under XIVE but sPAPR guests need
special support from the hypervisor to do the same. This is called the
XIVE native exploitation mode and today, it can be activated under the
PowerPC Hypervisor, pHyp. However, Linux/KVM lacks XIVE native support
and still offers the old interrupt mode interface using a
XICS-over-XIVE glue which implements the XICS hcalls.

The following series is a proposal to add the same support under KVM.

* KVM XIVE for sPAPR

A new KVM device is introduced for the XIVE native exploitation
mode. It reuses most of the XICS-over-XIVE glue implementation
structures, which are internal to KVM, but it has a completely
different interface. A set of Hypervisor calls configures the sources
and the event queues and from there, all control is done by the guest
through MMIOs.

These MMIO regions (ESB and TIMA) are exposed to guests in QEMU,
similarly to VFIO, and the associated VMAs are populated dynamically
with the appropriate pages using a fault handler. This is implemented
with a couple of KVM device ioctls.

On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
negotiation process determines whether the guest operates with an
interrupt controller using the XICS legacy model, as found on POWER8,
or in XIVE exploitation mode. This means that the KVM interrupt device
should be created at runtime, after the machine has started. This
requires extra KVM support to create/destroy KVM devices. The last
patches are an experimental attempt to solve that problem.

Migration also raises a couple of issues. The patchset provides the
necessary accessor routines to capture and restore the state of the
different structures used by KVM, OPAL and HW.
But as the migration is sequenced by QEMU, we might not have enough
quiescence points to correctly capture all HW state.

* Caveats

 - Migration of VMs under load: work in progress.
 - Resetting the KVM device has some bad consequences on the MMU.
   Needs more care.
 - Not much attention has been given to pass-through.

* Github

 QEMU       https://github.com/legoater/qemu/commits/xive
 Linux/KVM  https://github.com/legoater/linux/commits/xive

Thanks,

C.

Cédric Le Goater (16):
  powerpc/xive: export flags for the XIVE native exploitation mode hcalls
  powerpc/xive: add OPAL extensions for the XIVE native exploitation support
  KVM: PPC: Book3S HV: check the IRQ controller type
  KVM: PPC: Book3S HV: export services for the XIVE native exploitation device
  KVM: PPC: Book3S HV: add a new KVM device for the XIVE native exploitation mode
  KVM: PPC: Book3S HV: add a SET_SOURCE control to the XIVE native device
  KVM: PPC: Book3S HV: add a GET_ESB_FD control to the XIVE native device
  KVM: PPC: Book3S HV: add a GET_TIMA_FD control to XIVE native device
  KVM: PPC: Book3S HV: add a VC_BASE control to the XIVE native device
  KVM: PPC: Book3S HV: add a EISN attribute to kvmppc_xive_irq_state
  KVM: PPC: Book3S HV: add support for the XIVE native exploitation mode hcalls
  powerpc/xive: update HW definitions
  powerpc/xive: record guest queue page address
  KVM: PPC: Book3S HV: add support for XIVE native migration
  KVM: introduce a KVM_DESTROY_DEVICE ioctl
  KVM: PPC: Book3S HV: disconnect vCPU from IRQ device

 arch/powerpc/include/asm/kvm_host.h            |    2 +
 arch/powerpc/include/asm/kvm_ppc.h             |   78 +-
 arch/powerpc/include/asm/opal-api.h            |   13 +-
 arch/powerpc/include/asm/opal.h                |   12 +
 arch/powerpc/include/asm/xive-regs.h           |   45 +
 arch/powerpc/include/asm/xive.h                |   43 +
 arch/powerpc/include/uapi/asm/kvm.h            |   22 +
 arch/powerpc/kvm/Makefile                      |    4 +-
 arch/powerpc/kvm/book3s.c                      |   53 +-
 arch/powerpc/kvm/book3s_hv.c                   |   31 +
 arch/powerpc/kvm/book3s_hv_builtin.c           |  196 ++++
 arch/powerpc/kvm/book3s_hv_rm_xive_native.c    |   47 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S        |   52 +
 arch/powerpc/kvm/book3s_xics.c                 |    5 +-
 arch/powerpc/kvm/book3s_xive.c                 |  106 +-
 arch/powerpc/kvm/book3s_xive.h                 |   74 ++
 arch/powerpc/kvm/book3s_xive_native.c          | 1257 ++++++++++++++++++++++++
 arch/powerpc/kvm/book3s_xive_native_template.c |  381 +++++++
 arch/powerpc/kvm/powerpc.c                     |   52 +-
 arch/powerpc/platforms/powernv/opal-wrappers.S |    4 +
 arch/powerpc/sysdev/xive/native.c              |  107 ++
 arch/powerpc/sysdev/xive/spapr.c               |   28 +-
 include/uapi/linux/kvm.h                       |    5 +
 virt/kvm/kvm_main.c                            |   40 +
 24 files changed, 2585 insertions(+), 72 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_hv_rm_xive_native.c
 create mode 100644 arch/powerpc/kvm/book3s_xive_native.c
 create mode 100644 arch/powerpc/kvm/book3s_xive_native_template.c

--
2.13.6