On Wed, Apr 17, 2019 at 02:56:53PM +0200, Paolo Bonzini wrote: > On 16/04/19 22:32, Sean Christopherson wrote: > > + /* ndelay uses delay_tsc whenever the hardware has TSC, thus always. */ > > + if (guest_tsc < tsc_deadline) { > > + ns = (tsc_deadline - guest_tsc) * 1000000ULL; > > + do_div(ns, vcpu->arch.virtual_tsc_khz); > > + ndelay(min_t(u32, ns, lapic_timer_advance_ns)); > > + } > > > > This min_t would allow the timer to expire in advance of the indicated > deadline. If this happens, wouldn't it be a bug elsewhere? Maybe? In this exact variation of the code, no, since userspace can modify lapic_timer_advance_ns while the vCPU is running. Obviously that case goes away when lapic_timer_advance_ns is no longer directly consumed by vCPUS. Originally, the check was added because lockup occurred if the guest's TSC was sent backwards[1]. The discussion surrounding the bugfix patch[2] is probably more interesting/amusing/relevant. We're basically rehashing that discussion. :-) Anyways, capping at a KVM controlled value is prudent since not doing so can hang the host if something does go wrong. As for the __delay() vs ndelay() discussion, I'll try to dig more into how/why I saw the timer fire with an incorrect TSC, i.e. why ndelay() "works" but __delay() does not. https://bugzilla.redhat.com/show_bug.cgi?id=1337667 https://patchwork.kernel.org/patch/9130687/