Re: [PATCH v2 05/11] KVM: arm64: Defer WFI emulation as a requested event

Sean Christopherson <seanjc@xxxxxxxxxx> · Thu, 30 Sep 2021 17:09:07 +0000

On Thu, Sep 30, 2021, Marc Zyngier wrote:
> On Thu, 23 Sep 2021 20:16:04 +0100, Oliver Upton <oupton@xxxxxxxxxx> wrote:
> > @@ -681,6 +687,9 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
> >  		if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
> >  			kvm_vcpu_sleep(vcpu);
> >  
> > +		if (kvm_check_request(KVM_REQ_SUSPEND, vcpu))
> > +			kvm_vcpu_suspend(vcpu);
> > +

...

> > diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> > index 275a27368a04..5e5ef9ff4fba 100644
> > --- a/arch/arm64/kvm/handle_exit.c
> > +++ b/arch/arm64/kvm/handle_exit.c
> > @@ -95,8 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
> >  	} else {
> >  		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
> >  		vcpu->stat.wfi_exit_stat++;
> > -		kvm_vcpu_block(vcpu);
> > -		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> > +		kvm_make_request(KVM_REQ_SUSPEND, vcpu);
> >  	}
> >  
> >  	kvm_incr_pc(vcpu);
> 
> This is a change in behaviour. At the point where the blocking
> happens, PC will have already been incremented. I'd rather you don't
> do that. Instead, make the helper available and call into it directly,
> preserving the current semantics.

Is there architectural behavior that KVM can emulate?  E.g. if you were to probe a
physical CPU while it's waiting, would you observe the pre-WFI PC, or the post-WFI
PC?  Following arch behavior would be ideal because it eliminates subjectivity.
Regardless of the architectural behavior, changing KVM's behavior should be
done explicitly in a separate patch.

Irrespective of PC behavior, I would caution against using a request for handling
WFI.  Deferring the WFI opens up the possibility for all sorts of ordering
oddities, e.g. if KVM exits to userspace between here and check_vcpu_requests(),
then KVM can end up with a "spurious" pending KVM_REQ_SUSPEND if userspace
manipulates vCPU state.  I highly doubt that userspace VMMs would actually do
that, as it would basically require a signal from userspace, but it's not
impossible, and at the very least the pending request is yet another thing to
worry about in the future.
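
To make the ordering concern concrete, here is a toy userspace model of the two
schemes.  It is only a sketch: every toy_* name, the struct layout, and the
request bit are hypothetical stand-ins invented for illustration, not the
kernel's definitions, and "blocking" is reduced to recording the PC at which
the vCPU would have blocked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a request bit; not the kernel's KVM_REQ_SUSPEND. */
#define TOY_REQ_SUSPEND (1u << 0)

/* Hypothetical, heavily simplified vCPU; not struct kvm_vcpu. */
struct toy_vcpu {
	uint64_t pc;
	uint32_t requests;	/* pending request bits */
	int blocks;		/* number of times the vCPU "blocked" */
	uint64_t blocked_at;	/* PC observed when it last blocked */
};

/* Stand-in for blocking: just record that it happened and where. */
static void toy_block(struct toy_vcpu *vcpu)
{
	vcpu->blocks++;
	vcpu->blocked_at = vcpu->pc;
}

/* Direct scheme: block in the exit handler, then skip the instruction. */
static void toy_handle_wfi_direct(struct toy_vcpu *vcpu)
{
	toy_block(vcpu);
	vcpu->pc += 4;
}

/* Deferred scheme: only record a request, then skip the instruction. */
static void toy_handle_wfi_deferred(struct toy_vcpu *vcpu)
{
	vcpu->requests |= TOY_REQ_SUSPEND;
	vcpu->pc += 4;
}

/* Assumed userspace action between runs: redirect the vCPU elsewhere. */
static void toy_userspace_set_pc(struct toy_vcpu *vcpu, uint64_t pc)
{
	vcpu->pc = pc;		/* pending requests are left untouched */
}

/* Request processing on the next entry into the guest. */
static void toy_check_requests(struct toy_vcpu *vcpu)
{
	if (vcpu->requests & TOY_REQ_SUSPEND) {
		vcpu->requests &= ~TOY_REQ_SUSPEND;
		toy_block(vcpu);
	}
}
```

In this model, a deferred WFI at PC 0x1000 followed by toy_userspace_set_pc()
to 0x2000 and then toy_check_requests() blocks the vCPU at 0x2000, i.e. in
state that has nothing to do with the original WFI.  The direct variant blocks
at 0x1000 and leaves nothing pending.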

Unlike PSCI power-off, WFI isn't cross-vCPU, thus there's no hard requirement
for using a request.  And KVM_REQ_SLEEP also has an additional guard in that it
doesn't enter rcuwait if power_off (or pause) was cleared after the request was
made, e.g. if userspace stuffed vCPU state and set the vCPU RUNNABLE.
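
The extra guard can be sketched the same way.  Again, this is a toy model under
stated assumptions: the toy_* names and the struct are hypothetical, and the
real code waits on an rcuwait rather than bumping a counter.  The point is only
that the condition is re-evaluated when the request is handled, so a stale
request degrades into a no-op.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified state; not the kernel's vCPU arch state. */
struct toy_sleep_state {
	bool power_off;
	bool pause;
	bool sleep_requested;	/* stand-in for a pending sleep request */
	int waits;		/* how many times the vCPU actually waited */
};

static void toy_request_sleep(struct toy_sleep_state *s)
{
	s->sleep_requested = true;
}

/* Returns true if the vCPU would have waited, false otherwise. */
static bool toy_handle_sleep_request(struct toy_sleep_state *s)
{
	if (!s->sleep_requested)
		return false;
	s->sleep_requested = false;

	/* Guard: re-check the condition at handling time, not request time. */
	if (!s->power_off && !s->pause)
		return false;

	s->waits++;
	return true;
}
```

If power_off is set, a sleep is requested, and userspace then clears power_off
before the request is handled, toy_handle_sleep_request() consumes the request
without waiting.  A bare "suspend" request with no such condition to re-check
has no equivalent way to notice that it has gone stale.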

> It is also likely to clash with Sean's kvm_vcpu_block() rework, but we
> can work around that.

Ya.  Oliver, can you Cc me on future patches?  I'll try to keep my eyeballs on this
series.