Hello David, On Mon, Apr 06, 2020 at 07:58:19PM +1000, David Gibson wrote: > On Fri, Apr 03, 2020 at 03:01:03PM +0530, Gautham R Shenoy wrote: > > On Fri, Apr 03, 2020 at 12:20:26PM +1000, Nicholas Piggin wrote: > > > Gautham R. Shenoy's on March 31, 2020 10:10 pm: > > > > From: "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx> > > > > > > > > ISA v3.0 allows the guest to execute a stop instruction. For this, the > > > > PSSCR[ESL|EC] bits need to be cleared by the hypervisor before > > > > scheduling in the guest vCPU. > > > > > > > > Currently we always schedule in a vCPU with PSSCR[ESL|EC] bits > > > > set. This patch changes the behaviour to enter the guest with > > > > PSSCR[ESL|EC] bits cleared. This is a RFC patch where we > > > > unconditionally clear these bits. Ideally this should be done > > > > conditionally on platforms where the guest stop instruction has no > > > > Bugs (starting POWER9 DD2.3). > > > > > > How will guests know that they can use this facility safely after your > > > series? You need both DD2.3 and a patched KVM. > > > > > > Yes, this is something that isn't addressed in this series (mentioned > > in the cover letter), which is a POC demonstrating that the stop0lite > > state in guest works. > > > > However, to answer your question, this is the scheme that I had in > > mind : > > > > OPAL: > > On Procs >= DD2.3 : we publish a dt-cpu-feature "idle-stop-guest" > > > > Hypervisor Kernel: > > 1. If "idle-stop-guest" dt-cpu-feature is discovered, then > > we set bool enable_guest_stop = true; > > > > 2. During KVM guest entry, clear PSSCR[ESL|EC] iff > > enable_guest_stop == true. > > > > 3. In kvm_vm_ioctl_check_extension(), for a new capability > > KVM_CAP_STOP, return true iff enable_guest_top == true. > > > > QEMU: > > Check with the hypervisor if KVM_CAP_STOP is present. If so, > > indicate the presence to the guest via device tree. > > Nack. Presenting different capabilities to the guest depending on > host capabilities (rather than explicit options) is never ok. It > means that depending on the system you start on you may or may not be > able to migrate to other systems that you're supposed to be able to, I agree that blocking migration for the unavailability of this feature is not desirable. Could you point me to some other capabilities in KVM which have been implemented via explicit options? The ISA 3.0 allows the guest to execute the "stop" instruction. If the Hypervisor hasn't cleared the PSSCR[ESL|EC] then, guest executing the "stop" instruction in the causes a Hypervisor Facility Unavailable Exception, thus giving the hypervisor a chance to emulate the instruction. However, in the current code, when the hypervisor receives this exception, it sends a PROGKILL to the guest which results in crashing the guest. Patch 1 of this series emulates wakeup from the "stop" instruction. Would the following scheme be ok? OPAL: On Procs >= DD2.3 : we publish a dt-cpu-feature "idle-stop-guest" Hypervisor Kernel: If "idle-stop-guest" dt feature is available, then, before entering the guest, the hypervisor clears the PSSCR[EC|ESL] bits allowing the guest to safely execute stop instruction. If "idle-stop-guest" dt feature is not available, then, the Hypervisor sets the PSSCR[ESL|EC] bits, thereby causing a guest "stop" instruction execution to trap back into the hypervisor. We then emulate a wakeup from the stop instruction (Patch 1 of this series). Guest Kernel: If (cpu_has_feature(CPU_FTR_ARCH_300)) only then use the stop0lite cpuidle state. This allows us to migrate the KVM guest across any POWER9 Hypervisor. The minimal addition that the Hypervisor would need is Patch 1 of this series. -- Thanks and Regards gautham.