On Mon, Dec 08, 2014 at 01:19:15PM +0000, Marc Zyngier wrote:
> On 08/12/14 12:58, Christoffer Dall wrote:
> > On Mon, Dec 08, 2014 at 12:04:53PM +0000, Marc Zyngier wrote:
> >> On 03/12/14 21:18, Christoffer Dall wrote:
> >>> When a vcpu calls SYSTEM_OFF or SYSTEM_RESET with PSCI v0.2, the vcpus
> >>> should really be turned off for the VM adhering to the suggestions in
> >>> the PSCI spec, and it's the sane thing to do.
> >>>
> >>> Also, clarify the behavior and expectations for exits to user space with
> >>> the KVM_EXIT_SYSTEM_EVENT case.
> >>>
> >>> Signed-off-by: Christoffer Dall <christoffer.dall@xxxxxxxxxx>
> >>> ---
> >>>  Documentation/virtual/kvm/api.txt |  9 +++++++++
> >>>  arch/arm/kvm/psci.c               | 19 +++++++++++++++++++
> >>>  arch/arm64/include/asm/kvm_host.h |  1 +
> >>>  3 files changed, 29 insertions(+)
> >>>
> >>> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> >>> index 81f1b97..228f9cf 100644
> >>> --- a/Documentation/virtual/kvm/api.txt
> >>> +++ b/Documentation/virtual/kvm/api.txt
> >>> @@ -2957,6 +2957,15 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
> >>>  the system-level event type. The 'flags' field describes architecture
> >>>  specific flags for the system-level event.
> >>>
> >>> +Valid values for 'type' are:
> >>> +  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
> >>> +   VM. Userspace is not obliged to honour this, and if it does honour
> >>> +   this does not need to destroy the VM synchronously (ie it may call
> >>> +   KVM_RUN again before shutdown finally occurs).
> >>> +  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
> >>> +   As with SHUTDOWN, userspace can choose to ignore the request, or
> >>> +   to schedule the reset to occur in the future and may call KVM_RUN again.
> >>> +
> >>>  		/* Fix the size of the union. */
> >>>  		char padding[256];
> >>>  	};
> >>> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> >>> index 09cf377..ae0bb91 100644
> >>> --- a/arch/arm/kvm/psci.c
> >>> +++ b/arch/arm/kvm/psci.c
> >>> @@ -15,6 +15,7 @@
> >>>   * along with this program. If not, see <http://www.gnu.org/licenses/>.
> >>>   */
> >>>
> >>> +#include <linux/preempt.h>
> >>>  #include <linux/kvm_host.h>
> >>>  #include <linux/wait.h>
> >>>
> >>> @@ -166,6 +167,24 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
> >>>
> >>>  static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
> >>>  {
> >>> +	int i;
> >>> +	struct kvm_vcpu *tmp;
> >>> +
> >>> +	/*
> >>> +	 * The KVM ABI specifies that a system event exit may call KVM_RUN
> >>> +	 * again and may perform shutdown/reboot at a later time than when the
> >>> +	 * actual request is made. Since we are implementing PSCI and a
> >>> +	 * caller of PSCI reboot and shutdown expects that the system shuts
> >>> +	 * down or reboots immediately, let's make sure that VCPUs are not run
> >>> +	 * after this call is handled and before the VCPUs have been
> >>> +	 * re-initialized.
> >>> +	 */
> >>> +	kvm_for_each_vcpu(i, tmp, vcpu->kvm)
> >>> +		tmp->arch.pause = true;
> >>> +	preempt_disable();
> >>> +	force_vm_exit(cpu_all_mask);
> >>> +	preempt_enable();
> >>> +
> >>
> >> I'm slightly uneasy about this force_vm_exit, as this is something that
> >> is directly triggered by the guest. I suppose it is almost impossible to
> >> find out which CPUs we're actually using...
> >>
> > Ah, you mean we should only IPI the CPUs that are actually running a
> > VCPU belonging to this VM?
> >
> > I guess I could replace it with:
> >
> > 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > 		tmp->arch.pause = true;
> > 		kvm_vcpu_kick(tmp);
> > 	}
>
> Ah, that's even simpler than I thought. Yeah, looks good to me.
>
> >
> > or a slightly more optimized "half-open-coded-kvm_vcpu_kick":
> >
> > 	me = get_cpu();
> > 	kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> > 		tmp->arch.pause = true;
> > 		if (tmp->cpu != me && (unsigned)tmp->cpu < nr_cpu_ids &&
> > 		    cpu_online(tmp->cpu) && kvm_arch_vcpu_should_kick(tmp))
> > 			smp_send_reschedule(tmp->cpu);
> > 	}
> >
> > which should save us waking up vcpu threads that are parked on
> > waitqueues. Not sure it's worth it, maybe it is for 100s of vcpu
> > systems?
>
> Probably not worth it at the moment.
>
> > Can we actually replace force_vm_exit() with the more optimized
> > open-coded version? That messes with VMID allocation so it really needs
> > a lot of testing though...
>
> VMID reallocation almost never occurs, and that's a system-wide event,
> not triggered by a guest. I'd rather not mess with that just yet.
>
> > Preferences?
>
> I think your first version is very nice, provided that it doesn't
> introduce any unforeseen regression.
>
ok, will respin with option #1.

Thanks,
-Christoffer
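
For reference, option #1 can be modelled outside the kernel roughly as below. This is only a userspace sketch of the "pause every vcpu, then kick it" idea discussed above. All names in it (mock_vcpu, mock_vm, mock_vcpu_kick, mock_prepare_system_event, mock_vcpu_may_enter_guest) are made up for illustration and are not kernel APIs; the kick is reduced to a counter, whereas the real kvm_vcpu_kick() wakes or IPIs the vcpu thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_vcpu; not the kernel definition. */
struct mock_vcpu {
	bool pause;	/* models tmp->arch.pause */
	int kicks;	/* counts calls that model kvm_vcpu_kick() */
};

/* Hypothetical stand-in for struct kvm: just an array of vcpus. */
struct mock_vm {
	struct mock_vcpu *vcpus;
	size_t nr_vcpus;
};

/* Models the kick: here it only records that the vcpu was poked. */
void mock_vcpu_kick(struct mock_vcpu *vcpu)
{
	vcpu->kicks++;
}

/*
 * Models option #1: set the pause flag on every vcpu of this VM first,
 * then kick that vcpu so it notices the flag. Only vcpus belonging to
 * the VM are touched, unlike a broadcast to all host CPUs.
 */
void mock_prepare_system_event(struct mock_vm *vm)
{
	size_t i;

	assert(vm != NULL);
	for (i = 0; i < vm->nr_vcpus; i++) {
		vm->vcpus[i].pause = true;
		mock_vcpu_kick(&vm->vcpus[i]);
	}
}

/* A paused vcpu must not re-enter the guest until it is re-initialized. */
bool mock_vcpu_may_enter_guest(const struct mock_vcpu *vcpu)
{
	return !vcpu->pause;
}
```

The flag is set before the kick so that a vcpu which exits because of the kick already sees pause == true when it checks whether it may run again.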