On 4/30/19 12:11 PM, Paul Mackerras wrote:
> On Thu, Apr 18, 2019 at 12:39:25PM +0200, Cédric Le Goater wrote:
>> On the POWER9 processor, the XIVE interrupt controller can control
>> interrupt sources using MMIOs to trigger events, to EOI or to turn off
>> the sources. Priority management and interrupt acknowledgment are also
>> controlled by MMIO in the CPU presenter sub-engine.
>>
>> PowerNV/baremetal Linux runs natively under XIVE but sPAPR guests need
>> special support from the hypervisor to do the same. This is called the
>> XIVE native exploitation mode and today, it can be activated under the
>> PowerPC Hypervisor, pHyp. However, Linux/KVM lacks XIVE native support
>> and still offers the old interrupt mode interface using a KVM device
>> implementing the XICS hcalls over XIVE.
>>
>> The following series is a proposal to add the same support under KVM.
>>
>> A new KVM device is introduced for the XIVE native exploitation
>> mode. It reuses most of the XICS-over-XIVE glue implementation
>> structures which are internal to KVM but has a completely different
>> interface. A set of KVM device ioctls provide support for the
>> hypervisor calls, all handled in QEMU, to configure the sources and
>> the event queues. From there, all interrupt control is transferred to
>> the guest which can use MMIOs.
>>
>> These MMIO regions (ESB and TIMA) are exposed to guests in QEMU,
>> similarly to VFIO, and the associated VMAs are populated dynamically
>> with the appropriate pages using a fault handler. These are now
>> implemented using mmap()s of the KVM device fd.
>>
>> Migration has its own specific needs regarding memory. The patchset
>> provides a specific control to quiesce XIVE before capturing the
>> memory. The save and restore of the internal state is based on the
>> same ioctls used for the hcalls.
>>
>> On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
>> negotiation process determines whether the guest operates with an
>> interrupt controller using the XICS legacy model, as found on POWER8,
>> or in XIVE exploitation mode. Which means that the KVM interrupt
>> device should be created at run-time, after the machine has started.
>> This requires extra support from KVM to destroy KVM devices. It is
>> introduced at the end of the patchset and requires some attention.
>>
>> This is based on Linux 5.1-rc5 and is a candidate for 5.2. The OPAL
>> patches have been merged now.
>
> Thanks, patch series applied to my kvm-ppc-next tree. I added two
> patches of mine on top to make sure we exclude other execution paths
> in the device release method, and to clear the escalation interrupt
> hardware pointers on release. I also modified your last patch to free
> the xive structures in book3s.c rather than powerpc.c in order to fix
> compilation for Book E configs.

OK. I have one minor cleanup removing bogus checks in the release
method of the KVM device.

Thanks,

C.
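
For readers following the thread, the userspace side of the flow described in the cover letter (create the KVM device, then mmap() its fd to reach the ESB and TIMA pages) might look roughly like the sketch below. This is only an illustration under assumptions: the helper names xive_create_device() and xive_map_pages() are hypothetical, the page offset and page count are left to the caller rather than taken from the series, and KVM_DEV_TYPE_XIVE is assumed to be available from the installed <linux/kvm.h> (kernel headers 5.2 or later).

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: ask KVM to create the XIVE native device on an
 * existing VM file descriptor. Returns the new device fd, or -1 if the
 * KVM_CREATE_DEVICE ioctl fails (for example, no XIVE support).
 */
static int xive_create_device(int vm_fd)
{
    struct kvm_create_device cd = {
        .type = KVM_DEV_TYPE_XIVE, /* assumes kernel headers >= 5.2 */
        .fd = 0,                   /* filled in by the kernel on success */
        .flags = 0,
    };

    if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
        return -1;
    return (int)cd.fd;
}

/*
 * Hypothetical helper: map npages of the device fd, starting at the given
 * page offset. Which offset selects the TIMA or the ESB pages is defined
 * by the KVM device interface and is not assumed here. Returns NULL on
 * failure. The pages themselves are populated later, on first access, by
 * the kernel fault handler.
 */
static void *xive_map_pages(int dev_fd, long page_offset, size_t npages)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *addr;

    if (page_size <= 0 || npages == 0)
        return NULL;

    addr = mmap(NULL, npages * (size_t)page_size, PROT_READ | PROT_WRITE,
                MAP_SHARED, dev_fd, (off_t)page_offset * page_size);
    return addr == MAP_FAILED ? NULL : addr;
}
```

Both helpers report failure instead of aborting, so a caller can fall back to the XICS-over-XIVE device when the native one cannot be created.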