On 12/10/2014 07:07 PM, Marcelo Tosatti wrote: > On Thu, Dec 11, 2014 at 12:37:57AM +0100, Paolo Bonzini wrote: >> >> >> On 10/12/2014 21:57, Marcelo Tosatti wrote: >>> For the hrtimer which emulates the tscdeadline timer in the guest, >>> add an option to advance expiration, and busy spin on VM-entry waiting >>> for the actual expiration time to elapse. >>> >>> This allows achieving low latencies in cyclictest (or any scenario >>> which requires strict timing regarding timer expiration). >>> >>> Reduces cyclictest avg latency by 50%. >>> >>> Note: this option requires tuning to find the appropriate value >>> for a particular hardware/guest combination. One method is to measure the >>> average delay between apic_timer_fn and VM-entry. >>> Another method is to start with 1000ns, and increase the value >>> in say 500ns increments until avg cyclictest numbers stop decreasing. >> >> What values are you using in practice for the parameter? > > 7us. It takes 7us to get from TSC deadline expiration to the *start* of vmresume? That seems rather extreme. Is it possible that almost all of that latency is from deadline expiration to C-state exit? If so, can we teach the timer code to wake up early to account for that? We're supposed to know our idle exit latency these days. --Andy > >>> Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx> >>> >>> Index: kvm/arch/x86/kvm/lapic.c >>> =================================================================== >>> --- kvm.orig/arch/x86/kvm/lapic.c >>> +++ kvm/arch/x86/kvm/lapic.c >>> @@ -33,6 +33,7 @@ >>> #include <asm/page.h> >>> #include <asm/current.h> >>> #include <asm/apicdef.h> >>> +#include <asm/delay.h> >>> #include <linux/atomic.h> >>> #include <linux/jump_label.h> >>> #include "kvm_cache_regs.h" >>> @@ -1073,6 +1074,7 @@ static void apic_timer_expired(struct kv >>> { >>> struct kvm_vcpu *vcpu = apic->vcpu; >>> wait_queue_head_t *q = &vcpu->wq; >>> + struct kvm_timer *ktimer = &apic->lapic_timer; >>> >>> /* >>> * Note: KVM_REQ_PENDING_TIMER is implicitly checked in >>> @@ -1087,11 +1089,58 @@ static void apic_timer_expired(struct kv >>> >>> if (waitqueue_active(q)) >>> wake_up_interruptible(q); >>> + >>> + if (ktimer->timer_mode_mask == APIC_LVT_TIMER_TSCDEADLINE) >>> + ktimer->expired_tscdeadline = ktimer->tscdeadline; >>> +} >>> + >>> +static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu) >>> +{ >>> + struct kvm_lapic *apic = vcpu->arch.apic; >>> + u32 reg = kvm_apic_get_reg(apic, APIC_LVTT); >>> + >>> + if (kvm_apic_hw_enabled(apic)) { >>> + int vec = reg & APIC_VECTOR_MASK; >>> + >>> + if (kvm_x86_ops->test_posted_interrupt) >>> + return kvm_x86_ops->test_posted_interrupt(vcpu, vec); >>> + else { >>> + if (apic_test_vector(vec, apic->regs + APIC_ISR)) >>> + return true; >>> + } >> >> One branch here is testing IRR, the other is testing ISR. I think >> testing ISR is right; on APICv, the above test will cause a busy wait >> during a higher-priority task (or during an interrupt service routine >> for the timer itself), just because the timer interrupt was delivered. > > Yes. > >> So, on APICv, if the interrupt is in PIR but it has bits 7:4 <= >> PPR[7:4], you have a problem. :( There is no APICv hook that lets you >> get a vmexit when the PPR becomes low enough. > > Well, you simply exit earlier and busy spin for VM-exit > time. > > For Linux guests, there is no problem. > >>> + } >>> + return false; >>> +} >>> + >>> +void wait_lapic_expire(struct kvm_vcpu *vcpu) >>> +{ >>> + struct kvm_lapic *apic = vcpu->arch.apic; >>> + u64 guest_tsc, tsc_deadline; >>> + >>> + if (!kvm_vcpu_has_lapic(vcpu)) >>> + return; >>> + >>> + if (!apic_lvtt_tscdeadline(apic)) >>> + return; >> >> This test is wrong, I think. You need to check whether the timer >> interrupt was a TSC deadline interrupt. Instead, you are checking >> whether the current mode is TSC-deadline. This can be different if the >> interrupt could not be delivered immediately after it was received. >> This is easy to fix: replace the first two tests with >> "apic->lapic_timer.expired_tscdeadline != 0" and... > > Yes. > >>> + if (!lapic_timer_int_injected(vcpu)) >>> + return; >>> + tsc_deadline = apic->lapic_timer.expired_tscdeadline; >> >> ... set apic->lapic_timer.expired_tscdeadline to 0 here. >> >> But I'm not sure how to solve the above problem with APICv. That's a >> pity. Knowing what values you use in practice for the parameter, would >> also make it easier to understand the problem. Please report that >> together with the graphs produced by the unit test you added. > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html