On 08/12/14 12:58, Christoffer Dall wrote:
> On Mon, Dec 08, 2014 at 12:04:53PM +0000, Marc Zyngier wrote:
>> On 03/12/14 21:18, Christoffer Dall wrote:
>>> When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
>>> should really be turned off for the VM adhering to the suggestions in
>>> the PSCI spec, and it's the sane thing to do.
>>>
>>> Also, clarify the behavior and expectations for exits to user space with
>>> the KVM_EXIT_SYSTEM_EVENT case.
>>>
>>> Signed-off-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx>
>>> ---
>>>  Documentation/virtual/kvm/api.txt |  9 +++++++++
>>>  arch/arm/kvm/psci.c               | 19 +++++++++++++++++++
>>>  arch/arm64/include/asm/kvm_host.h |  1 +
>>>  3 files changed, 29 insertions(+)
>>>
>>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
>>> index 81f1b97..228f9cf 100644
>>> --- a/Documentation/virtual/kvm/api.txt
>>> +++ b/Documentation/virtual/kvm/api.txt
>>> @@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
>>>  the system-level event type. The 'flags' field describes architecture
>>>  specific flags for the system-level event.
>>>
>>> +Valid values for 'type' are:
>>> +  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
>>> +   VM. Userspace is not obliged to honour this, and if it does honour
>>> +   this does not need to destroy the VM synchronously (ie it may call
>>> +   KVM_RUN again before shutdown finally occurs).
>>> +  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
>>> +   As with SHUTDOWN, userspace can choose to ignore the request, or
>>> +   to schedule the reset to occur in the future and may call KVM_RUN again.
>>> +
>>>  		/* Fix the size of the union. */
>>>  		char padding[256];
>>>  	};
>>> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
>>> index 09cf377..ae0bb91 100644
>>> --- a/arch/arm/kvm/psci.c
>>> +++ b/arch/arm/kvm/psci.c
>>> @@ -15,6 +15,7 @@
>>>   * along with this program. If not, see <http://www.gnu.org/licenses/>.
>>>   */
>>>
>>> +#include <linux/preempt.h>
>>>  #include <linux/kvm_host.h>
>>>  #include <linux/wait.h>
>>>
>>> @@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
>>>
>>>  static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
>>>  {
>>> +	int i;
>>> +	struct kvm_vcpu *tmp;
>>> +
>>> +	/*
>>> +	 * The KVM ABI specifies that a system event exit may call KVM_RUN
>>> +	 * again and may perform shutdown/reboot at a later time that when the
>>> +	 * actual request is made. Since we are implementing PSCI and a
>>> +	 * caller of PSCI reboot and shutdown expects that the system shuts
>>> +	 * down or reboots immediately, let's make sure that VCPUs are not run
>>> +	 * after this call is handled and before the VCPUs have been
>>> +	 * re-initialized.
>>> +	 */
>>> +	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
>>> +		tmp->arch.pause = true;
>>> +	preempt_disable();
>>> +	force_vm_exit(cpu_all_mask);
>>> +	preempt_enable();
>>> +
>>
>> I'm slightly uneasy about this force_vm_exit, as this is something that
>> is directly triggered by the guest. I suppose it is almost impossible to
>> find out which CPUs we're actually using...
>>
> Ah, you mean we should only IPI the CPUs that are actually running a
> VCPU belonging to this VM?
>
> I guess I could replace it with:
>
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		tmp->arch.pause = true;
> 		kvm_vcpu_kick(tmp);
> 	}

Ah, that's even simpler than I thought. Yeah, looks good to me.

>
> or a slightly more optimized "half-open-coded-kvm_vcpu_kick":
>
> 	me = get_cpu();
> 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> 		tmp->arch.pause = true;
> 		if (tmp->cpu != me && (unsigned)tmp->cpu < nr_cpu_ids &&
> 		    cpu_online(tmp->cpu) && kvm_arch_vcpu_should_kick(tmp))
> 			smp_send_reschedule(tmp->cpu);
> 	}
>
> which should save us waking up vcpu threads that are parked on
> waitqueues. Not sure it's worth it, maybe it is for 100s of vcpu
> systems?
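
To make the difference between the two variants concrete, here is a
minimal userspace model. It is not kernel code: struct model_vcpu, its
fields, and both functions are hypothetical stand-ins, and the
"sleeping" and "in_guest" flags only approximate what the waitqueue
check and kvm_arch_vcpu_should_kick() decide. It assumes the simple
variant wakes parked vcpu threads and the optimized one does not, as
described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few vcpu fields discussed in the thread. */
struct model_vcpu {
	bool pause;	/* models tmp->arch.pause */
	int cpu;	/* physical CPU the vcpu last ran on, -1 if none */
	bool in_guest;	/* models kvm_arch_vcpu_should_kick() returning true */
	bool sleeping;	/* vcpu thread is parked on a waitqueue */
};

struct kick_stats {
	int wakeups;	/* waitqueue wakeups issued */
	int ipis;	/* reschedule IPIs sent */
};

/* True if the vcpu sits on another valid CPU and is running guest code. */
static bool needs_ipi(const struct model_vcpu *v, int me, int nr_cpus)
{
	return v->cpu != me && v->cpu >= 0 && v->cpu < nr_cpus && v->in_guest;
}

/* Variant 1: pause every vcpu and kick it, waking sleepers as well. */
static struct kick_stats pause_all_simple(struct model_vcpu *v, size_t n,
					  int me, int nr_cpus)
{
	struct kick_stats s = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		v[i].pause = true;
		if (v[i].sleeping)
			s.wakeups++;
		if (needs_ipi(&v[i], me, nr_cpus))
			s.ipis++;
	}
	return s;
}

/* Variant 2: pause every vcpu, but only IPI those running in the guest. */
static struct kick_stats pause_all_optimized(struct model_vcpu *v, size_t n,
					     int me, int nr_cpus)
{
	struct kick_stats s = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		v[i].pause = true;
		if (needs_ipi(&v[i], me, nr_cpus))
			s.ipis++;
	}
	return s;
}
```

In this model both variants send the same IPIs and leave every vcpu
paused; the only saving is the wakeup of parked threads, which grows
with the number of idle vcpus.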

Probably not worth it at the moment.

> Can we actually replace force_vm_exit() with the more optimized
> open-coded version? That messes with VMID allocation so it really needs
> a lot of testing though...

VMID reallocation almost never occurs, and that's a system-wide event,
not triggered by a guest. I'd rather not mess with that just yet.

> Preferences?

I think your first version is very nice, provided that it doesn't
introduce any unforeseen regression.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...