Il 08/10/2014 12:06, Radim Krčmář ha scritto: > KVM: x86: fix deadline tsc interrupt injection > > The check in kvm_set_lapic_tscdeadline_msr() was trying to prevent a > situation where we lose a pending deadline timer in a MSR write. > Losing it is fine, because it effectively occurs before the timer fired, > so we should be able to cancel or postpone it. > > Another problem comes from interaction with QEMU, or other userspace > that can set deadline MSR without a good reason, when timer is already > pending: one guest's deadline request results in more than one > interrupt because one is injected immediately on MSR write from > userspace and one through hrtimer later. > > The solution is to remove the injection when replacing a pending timer > and to improve the usual QEMU path, we inject without a hrtimer when the > deadline has already passed. > > Signed-off-by: Radim Krčmář <rkrcmar@xxxxxxxxxx> > Reported-by: Nadav Amit <namit@xxxxxxxxxxxxxxxxx> > --- > arch/x86/kvm/lapic.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index b8345dd..51428dd 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -1096,9 +1096,12 @@ static void start_apic_timer(struct kvm_lapic *apic) > if (likely(tscdeadline > guest_tsc)) { > ns = (tscdeadline - guest_tsc) * 1000000ULL; > do_div(ns, this_tsc_khz); > + hrtimer_start(&apic->lapic_timer.timer, > + ktime_add_ns(now, ns), HRTIMER_MODE_ABS); > + } else { > + atomic_inc(&ktimer->pending); > + kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu); > } > - hrtimer_start(&apic->lapic_timer.timer, > - ktime_add_ns(now, ns), HRTIMER_MODE_ABS); > > local_irq_restore(flags); > } > @@ -1355,9 +1358,6 @@ void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data) > return; > > hrtimer_cancel(&apic->lapic_timer.timer); > - /* Inject here so clearing tscdeadline won't override new value */ > - if (apic_has_pending_timer(vcpu)) > - kvm_inject_apic_timer_irqs(vcpu); > apic->lapic_timer.tscdeadline = data; > start_apic_timer(apic); > } Radim, the patch looks good but please extract this code: /* * There is a race window between reading and incrementing, but we do * not care about potentially losing timer events in the !reinject * case anyway. Note: KVM_REQ_PENDING_TIMER is implicitly checked * in vcpu_enter_guest. */ if (!atomic_read(&ktimer->pending)) { atomic_inc(&ktimer->pending); /* FIXME: this code should not know anything about vcpus */ kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu); } if (waitqueue_active(q)) wake_up_interruptible(q); to a new "static void apic_timer_expired(struct kvm_lapic *apic)" function, and call it from both apic_timer_fn and start_apic_timer. Also, we should not need to do wake_up_interruptible() unless we have changed ktimer->pending from zero to non-zero. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html