Re: [PATCH] KVM: nVMX: Fix IRQs inject to L2 which belong to L1 since race

On 2014-07-04 08:17, Wanpeng Li wrote:
> On Thu, Jul 03, 2014 at 01:15:26AM -0400, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@xxxxxxxxxxx> writes:
>>
>>> On 2014-07-02 08:54, Wanpeng Li wrote:
>>>> This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=72381
>>>>
>>>> If we didn't inject a still-pending event to L1 because of nested_run_pending,
>>>> KVM_REQ_EVENT should be requested after the vmexit in order to inject the
>>>> event to L1. However, the current logic blindly requests KVM_REQ_EVENT even if
>>>> there is no still-pending event for L1 that was blocked by nested_run_pending.
>>>> This opens a race in which an interrupt that belongs to L1 is injected into
>>>> L2 if L0 sends an interrupt to L1 during this window.
>>>>
>>>>                VCPU0                               another thread 
>>>>
>>>> L1 intr not blocked on L2 first entry
>>>> vmx_vcpu_run req event 
>>>> kvm check request req event 
>>>> check_nested_events don't have any intr 
>>>> not nested exit 
>>>>                                             intr occur (8254, lapic timer etc)
>>>> inject_pending_event now have intr 
>>>> inject interrupt 
>>>>
>>>> This patch fixes the race by introducing an l1_events_blocked field in nested_vmx
>>>> that indicates a still-pending event was blocked by nested_run_pending,
>>>> and requests KVM_REQ_EVENT only if such a blocked event exists.
>>>
>>> There are more, unrelated reasons why KVM_REQ_EVENT could be set. Why
>>> aren't those able to trigger this scenario?
>>>
>>> In any case, unconditionally setting KVM_REQ_EVENT seems strange and
>>> should be changed.
>>
>>
>> Ugh! I think I am hitting another one but this one's probably because 
>> we are not setting KVM_REQ_EVENT for something we should.
>>
>> Before this patch, I was able to hit this bug every time with
>> "modprobe kvm_intel ept=0 nested=1 enable_shadow_vmcs=0" and then booting 
>> L2. I can verify that I was indeed hitting the race in inject_pending_event.
>>
>> After this patch, I believe I am hitting another bug - this happens 
>> after I boot L2, as above, and then start a Linux kernel compilation
>> and then wait and watch :) It's a pain to debug because this happens
>> roughly one time in three. It never happens if I run with ept=1, but
>> I think that's only because the test completes sooner. But I can confirm
>> that I don't see it if I always set REQ_EVENT when nested_run_pending is
>> set, instead of taking the approach this patch takes.
>> (Any debugging hints appreciated!)
>>
>> So, I am not sure if this is the right fix. Rather, I think the safer thing
>> to do is to have the interrupt pending check for injection into L1 at
>> the "same site" as the call to kvm_queue_interrupt() just like we had before 
>> commit b6b8a1451fc40412c57d1. Is there any advantage to having all the 
>> nested events checks together ?
>>
> 
> How about reverting commit b6b8a1451 and checking whether the bug you
> mentioned is still there?

I suspect you will have to reset back to b6b8a1451^ for this as other
changes depend on this commit now.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux



