start_hv_timer() currently returns a bool, with %false interpreted as "failure" and %true as "success". But the reality is far more complex, as the %false is also returned when programming was successful, but must be canceled because there was already a pending timer interrupt. It can also return %false when programming was successful but the timer was marked expired because the deadline already passed, in which case the hv timer must also be canceled to avoid double injection. Functionally, the existing code is correct in the sense that it doesn't doing anything visibily wrong, e.g. generate spurious interrupts or miss an interrupt. But it's extremely confusing and inefficient, e.g. there are multiple extraneous calls to apic_timer_expired() that effectively get dropped due to @timer_pending being %true. Refactor start_hv_timer() to return an 'int' and move the handling of the various combinations of success vs. expiration vs. periodic up a level to restart_apic_timer(), which eliminates the needs to overload the meaning of %false. Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx> Suggested-by: Liran Alon <liran.alon@xxxxxxxxxx> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> --- arch/x86/kvm/lapic.c | 59 ++++++++++++++++++++++++++++++-------------- 1 file changed, 40 insertions(+), 19 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index fa2dbca5ad6a..b867569107d5 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1678,41 +1678,35 @@ static void cancel_hv_timer(struct kvm_lapic *apic) apic->lapic_timer.hv_timer_in_use = false; } -static bool start_hv_timer(struct kvm_lapic *apic) +/* + * Return: + * Negative errno on failure, + * '1' on success and the timer's deadline has passed, + * '0' on success and the timer's deadline is some time in the future. + */ +static int start_hv_timer(struct kvm_lapic *apic) { struct kvm_timer *ktimer = &apic->lapic_timer; int r; WARN_ON(preemptible()); if (!kvm_x86_ops->set_hv_timer) - return false; + return -ENOENT; if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending)) - return false; + return -EBUSY; if (!ktimer->tscdeadline) - return false; + return -EINVAL; r = kvm_x86_ops->set_hv_timer(apic->vcpu, ktimer->tscdeadline); if (r < 0) - return false; + return r; ktimer->hv_timer_in_use = true; hrtimer_cancel(&ktimer->timer); - /* - * Also recheck ktimer->pending, in case the sw timer triggered in - * the window. For periodic timer, leave the hv timer running for - * simplicity, and the deadline will be recomputed on the next vmexit. - */ - if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) { - if (r) - apic_timer_expired(apic); - return false; - } - - trace_kvm_hv_timer_state(apic->vcpu->vcpu_id, true); - return true; + return r; } static void start_sw_timer(struct kvm_lapic *apic) @@ -1734,9 +1728,36 @@ static void start_sw_timer(struct kvm_lapic *apic) static void restart_apic_timer(struct kvm_lapic *apic) { + int r; + preempt_disable(); - if (!start_hv_timer(apic)) + r = start_hv_timer(apic); + if (r < 0) { start_sw_timer(apic); + } else if (!apic_lvtt_period(apic)) { + /* + * For a periodic timer, leave the timer running for simplicity + * so that the deadline will be recomputed on the next VM-Exit. + */ + if (atomic_read(&apic->lapic_timer.pending)) { + /* + * Cancel the hv timer if the sw timer fired while the + * hv timer was being programmed. + */ + cancel_hv_timer(apic); + } else if (r) { + /* + * start_hv_timer() returns '1' if the timer's deadline + * has already passed. Again, go through with the full + * programming in the periodic case as the VM-Exit is + * needed to recompute the periodic timer deadline. + */ + apic_timer_expired(apic); + cancel_hv_timer(apic); + } + } + trace_kvm_hv_timer_state(apic->vcpu->vcpu_id, + apic->lapic_timer.hv_timer_in_use); preempt_enable(); } -- 2.21.0