Re: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers

Gleb Natapov <gleb@xxxxxxxxxx> · Sun, 24 Mar 2013 21:03:03 +0200



On Sun, Mar 24, 2013 at 10:45:53AM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-03-22:
> > On Fri, Mar 22, 2013 at 07:43:03AM -0300, Marcelo Tosatti wrote:
> >> On Fri, Mar 22, 2013 at 08:53:15AM +0200, Gleb Natapov wrote:
> >>> On Thu, Mar 21, 2013 at 08:06:41PM -0300, Marcelo Tosatti wrote:
> >>>> On Thu, Mar 21, 2013 at 11:13:39PM +0200, Gleb Natapov wrote:
> >>>>> On Thu, Mar 21, 2013 at 05:51:50PM -0300, Marcelo Tosatti wrote:
> >>>>>>>>> But current PI patches do break them, that's my point. So we
> >>>>>>>>> either need to revise them again, or drop LAPIC timer
> >>>>>>>>> reinjection. Making apic_accept_irq semantics "it returns
> >>>>>>>>> coalescing info, but only sometimes" is dubious though.
> >>>>>>>> We may rollback to the initial idea: test both irr and pir to
> >>>>>>>> get coalescing info. In this case, inject LAPIC timer always in
> >>>>>>>> vcpu context. So apic_accept_irq() will return right coalescing
> >>>>>>>> info.
> >>>>>>>> Also, we need to add comments to tell caller, apic_accept_irq()
> >>>>>>>> can ensure the return value is correct only when caller is in
> >>>>>>>> target vcpu context.
> >>>>>>>> 
> >>>>>>> We cannot touch irr while vcpu is in non-root operation, so we
> >>>>>>> will have to pass flag to apic_accept_irq() to let it know that it
> >>>>>>> is called synchronously. While all this is possible I want to know
> >>>>>>> which guests exactly will we break if we will not track interrupt
> >>>>>>> coalescing for lapic timer. If only 2.0 smp kernels will break we
> >>>>>>> can probably drop it.
> >>>>>> 
> >>>>>> RHEL4 / RHEL5 guests.
> >>>>> RHEL5 has kvmclock no? We should not break RHEL4 though.
> >>>> 
> >>>> kvmclock provides no timer interrupt... either LAPIC or PIT must be used
> >>>> with kvmclock.
> >>> I am confused now. If LAPIC is not used for wallclock time keeping, but
> >>> only for scheduling, the reinjection is actually harmful. Reinjecting the
> >>> interrupt will cause needless task rescheduling. So the question is if
> >>> there is a Linux kernel that uses LAPIC for wallclock time keeping and
> >>> relies on accurate number of injected interrupts to not time drift.
> >> 
> >> See 4acd47cfea9c18134e0cbf915780892ef0ff433a on RHEL5, RHEL5 kernels
> >> before that commit did not reinject.  Which means that all non-RHEL
> >> Linux guests based on that upstream code also suffer from the same
> >> problem.
> >> 
> > The commit actually fixes guest, not host. The existence of the commit
> > also means that LAPIC timer reinjection does not solve the problem and
> > all guests without this commit will suffer from the bug regardless of
> > what we will decide to do here. Without LAPIC timer reinjection the
> > effect of the bug will be much more visible and long lasting though.
> > 
> >> Also any other algorithm which uses LAPIC timers and compares them with
> >> other clocks (such as NMI watchdog) is potentially vulnerable.
> > They are with or without timer reinjection, as the commit you pointed to
> > shows.
> > 
> >> 
> >> Can drop it, and then wait until someone complains (if so).
> >> 
> > Yes, tough decision to make. All the complaints will be guest bugs which
> > can be hit without reinjection too, but with less probability. The reason
> > we are so keen on keeping RTC reinject is that the guests that depend on
> > it cannot be fixed.
> > 
> >>> Knowing that Linux tends to disable interrupts, it is likely that it
> >>> tries to detect and compensate for missing interrupts.
> >> 
> >> As said above, any algorithm which compares LAPIC timer interrupt with
> >> another clock is vulnerable.
> Any conclusion? 
> 
Let's not check for coalescing in PI patches for now.
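
For reference, the "test both irr and pir" idea quoted above can be
sketched roughly as below. This is only an illustration, not the KVM
code: struct toy_apic, toy_test_bit(), toy_set_bit() and toy_accept_irq()
are hypothetical names, the bitmaps are plain arrays, and nothing here is
atomic. As noted in the quoted discussion, such a result would only be
trustworthy when the caller runs in the target vcpu context, since irr
cannot be touched while the vcpu is in non-root operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the virtual APIC state; not the KVM structures. */
struct toy_apic {
	uint32_t irr[8];	/* interrupt request register, 256 vectors */
	uint32_t pir[8];	/* posted-interrupt requests, 256 vectors */
};

static bool toy_test_bit(const uint32_t *map, unsigned int vec)
{
	assert(vec < 256);
	return (map[vec / 32] >> (vec % 32)) & 1u;
}

static void toy_set_bit(uint32_t *map, unsigned int vec)
{
	assert(vec < 256);
	map[vec / 32] |= 1u << (vec % 32);
}

/*
 * Queue vector 'vec' through pir. Returns true if the interrupt was newly
 * queued, false if it was coalesced with one already pending in irr or pir.
 * Single-threaded sketch: a real implementation would need atomic bit ops.
 */
static bool toy_accept_irq(struct toy_apic *apic, unsigned int vec)
{
	bool pending = toy_test_bit(apic->irr, vec) ||
		       toy_test_bit(apic->pir, vec);

	toy_set_bit(apic->pir, vec);
	return !pending;
}
```

A caller that reinjects timer ticks would count the false returns as
coalesced interrupts. Not checking for coalescing means ignoring this
return value for the LAPIC timer.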

--
			Gleb.