On Wed, Apr 17, 2019 at 02:57:55PM +0200, Paolo Bonzini wrote: > On 16/04/19 22:32, Sean Christopherson wrote: > > To minimize the latency of timer interrupts as observed by the guest, > > KVM adjusts the values it programs into the host timers to account for > > the host's overhead of programming and handling the timer event. Now > > that the timer advancement is automatically tuned during runtime, it's > > effectively unbounded by default, e.g. if KVM is running as L1 the > > advancement can measure in hundreds of milliseconds. > > > > Place a somewhat arbitrary hard cap of 5000ns on the auto-calculated > > advancement, as large advancements can break reasonable assumptions of > > the guest, e.g. that a timer configured to fire after 1ms won't arrive > > on the next instruction. Although KVM busy waits to mitigate the timer > > event arriving too early, complications can arise when shifting the > > interrupt too far, e.g. vmx.flat/interrupt in kvm-unit-tests will fail > > when its "host" exits on interrupts (because the INTR is injected before > > the gets executes STI+HLT). Arguably the unit test is "broken" in the > > sense that delaying the timer interrupt by 1ms doesn't technically > > guarantee the interrupt will arrive after STI+HLT, but it's a reasonable > > assumption that KVM should support. > > > > Furthermore, an unbounded advancement also effectively unbounds the time > > spent busy waiting, e.g. if the guest programs a timer with a very large > > delay. > > > > Arguably the advancement logic could simply be disabled when running as > > L1, but KVM needs to bound the advancement time regardless, e.g. if the > > TSC is unstable and the calculations get wildly out of whack. And > > allowing the advancement when running as L1 is a good stress test of > > sorts for the logic. > > > > Cc: Liran Alon <liran.alon@xxxxxxxxxx> > > Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx> > > Fixes: 3b8a5df6c4dc6 ("KVM: LAPIC: Tune lapic_timer_advance_ns automatically") > > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > > --- > > arch/x86/kvm/lapic.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > > index 9bf70cf84564..92446cba9b24 100644 > > --- a/arch/x86/kvm/lapic.c > > +++ b/arch/x86/kvm/lapic.c > > @@ -74,6 +74,7 @@ static bool lapic_timer_advance_adjust_done = false; > > #define LAPIC_TIMER_ADVANCE_ADJUST_DONE 100 > > /* step-by-step approximation to mitigate fluctuation */ > > #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8 > > +#define LAPIC_TIMER_ADVANCE_MAX_NS 5000 > > > > static inline int apic_test_vector(int vec, void *bitmap) > > { > > @@ -1522,6 +1523,10 @@ void wait_lapic_expire(struct kvm_vcpu *vcpu) > > } > > if (abs(guest_tsc - tsc_deadline) < LAPIC_TIMER_ADVANCE_ADJUST_DONE) > > lapic_timer_advance_adjust_done = true; > > + if (unlikely(lapic_timer_advance_ns > LAPIC_TIMER_ADVANCE_MAX_NS)) { > > + lapic_timer_advance_ns = LAPIC_TIMER_ADVANCE_MAX_NS; > > + lapic_timer_advance_adjust_done = true; > > + } > > I would treat this case as "advancing the timer has failed miserably" > and reset lapic_timer_advance_ns to 0. As in, disable the feature, correct? That works for me.