> On 16 Apr 2019, at 14:02, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 12/04/19 22:18, Sean Christopherson wrote:
>> To minimize the latency of timer interrupts as observed by the guest,
>> KVM adjusts the values it programs into the host timers to account for
>> the host's overhead of programming and handling the timer event. In
>> the event that the adjustments are too aggressive, i.e. the timer fires
>> earlier than the guest expects, KVM busy waits immediately prior to
>> entering the guest.
>>
>> Currently, KVM manually converts the delay from nanoseconds to clock
>> cycles. But, the conversion is done in the guest's time domain, while
>> the delay occurs in the host's time domain, i.e. the delay may not be
>> accurate and could wait too little or too long. Sidestep the headache
>> of shifting time domains by delaying 1ns at a time and breaking the loop
>> when the guest's desired timer delay has been satisfied. Because the
>> advancement, which caps the delay to avoid unbounded busy waiting, is
>> stored in nanoseconds, the current advancement time can simply be used
>> as a loop terminator since we're delaying 1ns at a time (plus the few
>> cycles of overhead for running the code).
>
> A 1 nanosecond advance (3-5 clock cycles) is even shorter than the time
> taken to execute kvm_read_l1_tsc. I would just replace it with
> cpu_relax(); I can do the change when applying.
>
> Paolo

Paolo, there is also another approach Sean and I were discussing.
I think it is more elegant. See previous comments in this thread.
-Liran

>
>> Cc: Liran Alon <liran.alon@xxxxxxxxxx>
>> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>> ---
>>  arch/x86/kvm/lapic.c | 13 +++++++------
>>  1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 92446cba9b24..e797e3145a8b 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1486,7 +1486,8 @@ static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
>>  void wait_lapic_expire(struct kvm_vcpu *vcpu)
>>  {
>>  	struct kvm_lapic *apic = vcpu->arch.apic;
>> -	u64 guest_tsc, tsc_deadline, ns;
>> +	u32 timer_advance_ns = lapic_timer_advance_ns;
>> +	u64 guest_tsc, tmp_tsc, tsc_deadline, ns;
>>
>>  	if (!lapic_in_kernel(vcpu))
>>  		return;
>> @@ -1499,13 +1500,13 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu)
>>
>>  	tsc_deadline = apic->lapic_timer.expired_tscdeadline;
>>  	apic->lapic_timer.expired_tscdeadline = 0;
>> -	guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
>> +	tmp_tsc = guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
>>  	trace_kvm_wait_lapic_expire(vcpu->vcpu_id, guest_tsc - tsc_deadline);
>>
>> -	/* __delay is delay_tsc whenever the hardware has TSC, thus always.  */
>> -	if (guest_tsc < tsc_deadline)
>> -		__delay(min(tsc_deadline - guest_tsc,
>> -			nsec_to_cycles(vcpu, lapic_timer_advance_ns)));
>> +	for (ns = 0; tmp_tsc < tsc_deadline && ns < timer_advance_ns; ns++) {
>> +		ndelay(1);
>> +		tmp_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
>> +	}
>>
>>  	if (!lapic_timer_advance_adjust_done) {
>>  		/* too early */
>>