On 21/06/2016, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote: > The host timer which emulates the guest LAPIC TSC deadline > timer has its expiration diminished by lapic_timer_advance_ns > nanoseconds. Therefore if, at wait_lapic_expire, a difference > larger than lapic_timer_advance_ns is encountered, delay at most > lapic_timer_advance_ns. > > This fixes a problem where the guest can cause the host > to delay for large amounts of time. > > Reported-by: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx> > > --- > > v2: s/max/min/ ! > > Index: kvm/arch/x86/kvm/lapic.c > =================================================================== > --- kvm.orig/arch/x86/kvm/lapic.c > +++ kvm/arch/x86/kvm/lapic.c > @@ -1164,7 +1164,8 @@ void wait_lapic_expire(struct kvm_vcpu * > > /* __delay is delay_tsc whenever the hardware has TSC, thus always. */ > if (guest_tsc < tsc_deadline) > - __delay(tsc_deadline - guest_tsc); > + __delay(min(tsc_deadline - guest_tsc, > + nsec_to_cycles(vcpu, lapic_timer_advance_ns))); > } > > static void start_apic_timer(struct kvm_lapic *apic) > Can we make sure this is sent to -stable? Since I tested this approach (I think complete with the same max/min correction :) before I started trying to boil the ocean, it looks good to me. I expect nsec_to_cycles() always rounds down. Any rounding up would be undesirable :) - but this way, even if host sets tsc_khz=1, all that can happen is the guest is woken 1 tick before the deadline is met. It wouldn't be a whole millisecond early in real time either, just as much as lapic_timer_advance_ns. And talking of scaling, it will cut off the other potential lockup as well where we currently fail to scale the guest delay down into host TSC ticks. I also suggested guests could observe early wakeup when sysadmins tune lapic_timer_advance_ns (i.e. it was reduced after the timer was set). But Paolo pointed out the nice way to fix that would be to make lapic_timer_advance_ns read-only (explicit requirement to `rmmod kvm` before you can try a different value). Thanks for this Alan -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html