On Mon, Apr 15, 2019 at 08:25:48PM +0300, Liran Alon wrote: > > > > On 15 Apr 2019, at 19:32, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote: > > > > On Sun, Apr 14, 2019 at 03:15:41PM +0300, Liran Alon wrote: > >> > >> > >>> On 12 Apr 2019, at 23:18, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote: > >>> > >>> Calling apic_timer_expired() is a nop when a timer interrupt is already > >>> pending, i.e. there's no need to call apic_timer_expired() when there's > >>> a pending interrupt and the hv_timer wants to pend its own interrupt. > >>> Separate the two flows to make the code more readable and to avoid an > >>> unnecessary function call and read to ktimer->pending. > >> > >> In case timer is not periodic and r==1, atomic_read(&ktimer->pending) is not executed. > >> > >>> > >>> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx> > >>> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx> > >>> --- > >>> arch/x86/kvm/lapic.c | 9 ++++++--- > >>> 1 file changed, 6 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > >>> index 1d649a2af04c..f0be6f148a47 100644 > >>> --- a/arch/x86/kvm/lapic.c > >>> +++ b/arch/x86/kvm/lapic.c > >>> @@ -1703,9 +1703,12 @@ static bool start_hv_timer(struct kvm_lapic *apic) > >>> * the window. For periodic timer, leave the hv timer running for > >>> * simplicity, and the deadline will be recomputed on the next vmexit. > >>> */ > >>> - if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) { > >>> - if (r) > >>> - apic_timer_expired(apic); > >>> + if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending)) > >>> + return false; > >>> + > >>> + /* set_hv_timer() returns '1' when the timer has already expired. */ > >>> + if (r) { > >>> + apic_timer_expired(apic); > >>> return false; > >>> } > >>> > >>> -- > >>> 2.21.0 > >>> > >> > >> First, I think you should emphasise in commit message that you have actually > >> fixed a rare bug here. In case timer is periodic but given > >> ktimer->tscdeadline has already expired on host, we should call > >> apic_timer_expired(). > > > > Heh, I actually didn't even catch that bug, I was simply cleaning up the > > code because I had a hard time following the logic. > > LOL. So you can put me in the Reported-by tag :P Actually, thinking about this more, I believe the original behavior was correct, if poorly documented. More info below. > >> In addition, when start_hv_timer() returns false, restart_apic_timer() just > >> calls start_sw_timer() which use hrtimer instead of VMX preemption timer. > >> Therefore, it seems a bit ineffective to me for start_hv_timer() to return > >> false in case ktimer->pending or when ktimer->tscdeadline already expired. > >> Shouldn’t we return true in these cases? > > > > That also seemed weird to me. Again, I had a hell of a time following the > > intended logic and didn't want to break anything. AFAICT, the motivation > > for calling start_sw_timer() is to cancel the HV timer, and possibly to > > ensure start_sw_period() is called when necessary. > > I think the motivation is that if there is any reason why hardware > accelerated timer (i.e. VMX preemption timer), can't be used to emulate the > LAPIC timer, then utilise a software hrtimer based implementation instead. My comment was regarding why start_hv_timer() returns was when the hv_timer as already expired. > This does align with why we return false when (!kvm_x86_ops->set_hv_timer) or > (kvm_x86_ops->set_hv_timer() < 0). However, this doesn’t align in case we > have a (non-periodic timer and ktimer->pending) OR ktimer->tscdeadline > already expired OR (!ktimer->tscdeadline). > > In fact, note that start_sw_timer() early-exit when non-periodic timer and > ktimer->pending… Same is also true for start_sw_tscdeadline() early-exit when > (!ktimer->tscdeadline). > > > But the latter will be > > handled by virtue of checking "r" after apic_lvtt_period(), so this? > > > > if (r) { > > apic_timer_expired(apic); > > ktimer->hv_timer_in_use = false; > > return true; > > } > > I think I will just submit a patch to fix all the above examples I made as > this just seems wrong to me. Unless you find something I have missed. :P When the timer is periodic, we're relying on the timer handler to invoke advance_periodic_target_expiration() by way of kvm_lapic_expired_hv_timer(). That's why the original code only checks @r if apic_lvtt_period()==false, i.e. to actually trigger a VMX preemption timer VM-Exit. Note that the return from set_hv_timer() is essentially a hint, e.g. VMX is perfectly fine programming a preemption timer with a value of zero. I think Paolo's suggestion of moving the logic up into restart_apic_timer() is the way to go as it reduces the multiplexing down on start_hv_timer()'s return value.