On Thu, Sep 30, 2021, Marc Zyngier wrote:
> On Thu, 23 Sep 2021 20:16:04 +0100, Oliver Upton <oupton@xxxxxxxxxx> wrote:
> > @@ -681,6 +687,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
> >  		if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
> >  			kvm_vcpu_sleep(vcpu);
> >
> > +		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
> > +			kvm_vcpu_suspend(vcpu);
> > +

...

> > diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> > index 275a27368a04..5e5ef9ff4fba 100644
> > --- a/arch/arm64/kvm/handle_exit.c
> > +++ b/arch/arm64/kvm/handle_exit.c
> > @@ -95,8 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
> >  	} else {
> >  		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> >  		vcpu->stat.wfi_exit_stat++;
> > -		kvm_vcpu_block(vcpu);
> > -		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> > +		kvm_make_request(KVM_REQ_SUSPEND, vcpu);
> >  	}
> >
> >  	kvm_incr_pc(vcpu);
>
> This is a change in behaviour. At the point where the blocking
> happens, PC will have already been incremented. I'd rather you don't
> do that. Instead, make the helper available and call into it directly,
> preserving the current semantics.

Is there architectural behavior that KVM can emulate?  E.g. if you were to
probe a physical CPU while it's waiting, would you observe the pre-WFI PC,
or the post-WFI PC?  Following arch behavior would be ideal because it
eliminates subjectivity.

Regardless of the architectural behavior, changing KVM's behavior should
be done explicitly in a separate patch.

Irrespective of PC behavior, I would caution against using a request for
handling WFI.  Deferring the WFI opens up the possibility for all sorts of
ordering oddities, e.g. if KVM exits to userspace between here and
check_vcpu_requests(), then KVM can end up with a "spurious" pending
KVM_REQ_SUSPEND if userspace manipulates vCPU state.
I highly doubt that userspace VMMs would actually do that, as it would
basically require a signal from userspace, but it's not impossible, and at
the very least the pending request is yet another thing to worry about in
the future.

Unlike PSCI power-off, WFI isn't cross-vCPU, thus there's no hard
requirement for using a request.  And KVM_REQ_SLEEP also has an additional
guard in that it doesn't enter rcuwait if power_off (or pause) was cleared
after the request was made, e.g. if userspace stuffed vCPU state and set
the vCPU RUNNABLE.

> It is also likely to clash with Sean's kvm_vcpu_block() rework, but we
> can work around that.

Ya.  Oliver, can you Cc me on future patches?  I'll try to keep my
eyeballs on this series.
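
To make the ordering concern concrete, here is a rough userspace toy model of
the two approaches.  It is only a sketch: every name in it (toy_vcpu,
toy_wfi_deferred(), TOY_REQ_SUSPEND, etc.) is made up for illustration and
none of it is KVM code.  It assumes that "userspace manipulates vCPU state"
can be modelled as overwriting the PC, and it counts a block as "spurious" if
it is taken after that overwrite.

```c
/* Toy userspace model of the ordering issue; none of these are kernel APIs. */
#include <assert.h>
#include <stdbool.h>

#define TOY_REQ_SUSPEND 0u	/* hypothetical request bit */

struct toy_vcpu {
	unsigned long requests;	/* pending request bitmap */
	unsigned long pc;	/* guest program counter */
	int blocks;		/* number of times the vCPU blocked */
	int spurious_blocks;	/* blocks taken after userspace replaced state */
	bool state_replaced;	/* userspace rewrote vCPU state since the WFI */
};

static void toy_make_request(struct toy_vcpu *v, unsigned int req)
{
	assert(req < 8 * sizeof(v->requests));
	v->requests |= 1ul << req;
}

/* Test-and-clear: checking a request consumes it. */
static bool toy_check_request(struct toy_vcpu *v, unsigned int req)
{
	unsigned long mask = 1ul << req;
	bool pending = (v->requests & mask) != 0;

	v->requests &= ~mask;
	return pending;
}

static void toy_block(struct toy_vcpu *v)
{
	v->blocks++;
	if (v->state_replaced)
		v->spurious_blocks++;
}

/* Deferred variant: WFI only records a request, blocking waits for re-entry. */
static void toy_wfi_deferred(struct toy_vcpu *v)
{
	toy_make_request(v, TOY_REQ_SUSPEND);
	v->pc += 4;
}

/* Direct variant: block at the point of the WFI, then advance the PC. */
static void toy_wfi_direct(struct toy_vcpu *v)
{
	toy_block(v);
	v->pc += 4;
}

/* Userspace gets control (e.g. via a signal) and rewrites vCPU state. */
static void toy_userspace_set_state(struct toy_vcpu *v, unsigned long new_pc)
{
	v->pc = new_pc;
	v->state_replaced = true;
}

/* Re-entry path: service whatever requests are still pending. */
static void toy_enter_guest(struct toy_vcpu *v)
{
	if (toy_check_request(v, TOY_REQ_SUSPEND))
		toy_block(v);
}

/* WFI, exit to userspace, state rewrite, re-entry; returns spurious blocks. */
int toy_scenario_deferred(void)
{
	struct toy_vcpu v = { .pc = 0x1000 };

	toy_wfi_deferred(&v);
	toy_userspace_set_state(&v, 0x2000);
	toy_enter_guest(&v);
	return v.spurious_blocks;
}

int toy_scenario_direct(void)
{
	struct toy_vcpu v = { .pc = 0x1000 };

	toy_wfi_direct(&v);
	toy_userspace_set_state(&v, 0x2000);
	toy_enter_guest(&v);
	return v.spurious_blocks;
}
```

In this model the deferred scenario blocks once after the state rewrite, since
the stale request outlives the WFI that raised it, while the direct scenario
never does.  The real code has more moving parts (wake conditions, other
requests), so treat this purely as an illustration of the window.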