RE: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers

"Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx> · Sun, 24 Mar 2013 10:45:53 +0000



Gleb Natapov wrote on 2013-03-22:
> On Fri, Mar 22, 2013 at 07:43:03AM -0300, Marcelo Tosatti wrote:
>> On Fri, Mar 22, 2013 at 08:53:15AM +0200, Gleb Natapov wrote:
>>> On Thu, Mar 21, 2013 at 08:06:41PM -0300, Marcelo Tosatti wrote:
>>>> On Thu, Mar 21, 2013 at 11:13:39PM +0200, Gleb Natapov wrote:
>>>>> On Thu, Mar 21, 2013 at 05:51:50PM -0300, Marcelo Tosatti wrote:
>>>>>>>>> But current PI patches do break them, that's my point. So we
>>>>>>>>> either need to revise them again, or drop LAPIC timer
>>>>>>>>> reinjection. Making apic_accept_irq semantics "it returns
>>>>>>>>> coalescing info, but only sometimes" is dubious though.
>>>>>>>> We may roll back to the initial idea: test both irr and pir to
>>>>>>>> get coalescing info. In this case, always inject the LAPIC timer
>>>>>>>> in vcpu context, so apic_accept_irq() will return the right
>>>>>>>> coalescing info.
>>>>>>>> Also, we need to add comments to tell caller, apic_accept_irq()
>>>>>>>> can ensure the return value is correct only when caller is in
>>>>>>>> target vcpu context.
>>>>>>>> 
>>>>>>> We cannot touch irr while vcpu is in non-root operation, so we
>>>>>>> will have to pass flag to apic_accept_irq() to let it know that it
>>>>>>> is called synchronously. While all this is possible I want to know
>>>>>>> which guests exactly will we break if we will not track interrupt
>>>>>>> coalescing for lapic timer. If only 2.0 smp kernels will break we
>>>>>>> can probably drop it.
>>>>>> 
>>>>>> RHEL4 / RHEL5 guests.
>>>>> RHEL5 has kvmclock, no? We should not break RHEL4 though.
>>>> 
>>>> kvmclock provides no timer interrupt... either LAPIC or PIT must be used
>>>> with kvmclock.
>>> I am confused now. If LAPIC is not used for wallclock time keeping, but
>>> only for scheduling, the reinjection is actually harmful. Reinjecting the
>>> interrupt will cause needless task rescheduling. So the question is if
>>> there is a Linux kernel that uses LAPIC for wallclock time keeping and
>>> relies on accurate number of injected interrupts to not time drift.
>> 
>> See 4acd47cfea9c18134e0cbf915780892ef0ff433a on RHEL5, RHEL5 kernels
>> before that commit did not reinject.  Which means that all non-RHEL
>> Linux guests based on that upstream code also suffer from the same
>> problem.
>> 
> The commit actually fixes guest, not host. The existence of the commit
> also means that LAPIC timer reinjection does not solve the problem and
> all guests without this commit will suffer from the bug regardless of
> what we will decide to do here. Without LAPIC timer reinjection the
> effect of the bug will be much more visible and long lasting though.
> 
>> Also any other algorithm which uses LAPIC timers and compares them with
>> other clocks (such as NMI watchdog) is potentially vulnerable.
> They are with or without timer reinjection, as the commit you pointed to
> shows.
> 
>> 
>> Can drop it, and then wait until someone complains (if so).
>> 
> Yes, tough decision to make. All the complaints will be guest bugs which
> can be hit without reinjection too, but with less probability. The reason
> we are so keen on keeping RTC reinjection is that the guests that depend
> on it cannot be fixed.
> 
>>> Knowing that Linux tends to disable interrupts, it is likely that it
>>> tries to detect and compensate for missing interrupts.
>> 
>> As said above, any algorithm which compares LAPIC timer interrupt with
>> another clock is vulnerable.
Any conclusion? 
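
For reference, the "test both irr and pir" idea quoted above could be
sketched roughly as below. This is only a toy model, not the KVM code:
struct toy_vcpu and the toy_* helpers are made-up names, the bitmaps are
plain arrays, and there is no locking or atomicity at all. It assumes the
caller runs in the target vcpu context, which is exactly the restriction
discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: 256 vectors, one bit each, in IRR and in PIR. */
struct toy_vcpu {
	uint32_t irr[8];	/* interrupt request register */
	uint32_t pir[8];	/* posted interrupt requests */
};

static bool toy_test_bit(const uint32_t *map, unsigned int vec)
{
	return (map[vec / 32] >> (vec % 32)) & 1u;
}

static void toy_set_bit(uint32_t *map, unsigned int vec)
{
	map[vec / 32] |= 1u << (vec % 32);
}

/*
 * Returns true if the vector was newly posted, false if it was
 * coalesced with an instance still pending in either IRR or PIR.
 * The new request is only ever written to PIR, never to IRR.
 */
static bool toy_accept_irq(struct toy_vcpu *v, unsigned int vec)
{
	if (toy_test_bit(v->irr, vec) || toy_test_bit(v->pir, vec))
		return false;
	toy_set_bit(v->pir, vec);
	return true;
}

/* Models the PIR contents being merged into IRR and PIR being cleared. */
static void toy_sync_pir_to_irr(struct toy_vcpu *v)
{
	for (unsigned int i = 0; i < 8; i++) {
		v->irr[i] |= v->pir[i];
		v->pir[i] = 0;
	}
}

/* Small self-check of the coalescing behaviour described above. */
static void toy_selfcheck(void)
{
	struct toy_vcpu v = {0};
	unsigned int vec = 0xef;	/* arbitrary vector for illustration */

	assert(toy_accept_irq(&v, vec));	/* first one is posted */
	assert(!toy_accept_irq(&v, vec));	/* still pending in PIR */
	toy_sync_pir_to_irr(&v);
	assert(!toy_accept_irq(&v, vec));	/* now pending in IRR */
}
```

A timer path that wants to reinject would count the false returns and
retry later; if reinjection is dropped, the return value can be ignored
for the LAPIC timer.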

Best regards,
Yang
