Re: Lost interrupts with upstream KVM

Jan Kiszka <jan.kiszka@xxxxxxxxxxx> · Fri, 29 May 2009 19:26:52 +0200

Gleb Natapov wrote:
> On Fri, May 29, 2009 at 07:01:44PM +0200, Jan Kiszka wrote:
>> Gleb Natapov wrote:
>>> On Fri, May 29, 2009 at 06:46:47PM +0200, Jan Kiszka wrote:
>>>> Gleb Natapov wrote:
>>>>> On Fri, May 29, 2009 at 04:52:41PM +0200, Jan Kiszka wrote:
>>>>>> Gleb Natapov wrote:
>>>>>>> On Fri, May 29, 2009 at 10:23:24AM +0200, Jan Kiszka wrote:
>>>>>>>> Hi Gleb,
>>>>>>>>
>>>>>>>> with latest kernel modules, namely beginning with 6bc0a1a235 (Remove
>>>>>>>> irq_pending bitmap), I'm loosing interrupts with upstream's KVM support.
>>>>>>>> After some bisecting, hair-pulling and a bit meditation I added a
>>>>>>>> WARN_ON(kvm_cpu_has_interrupt(vcpu)) to kvm_vcpu_ioctl_interrupt, and it
>>>>>>>> actually triggered right before the guest got stuck.
>>>>>>>>
>>>>>>>> This didn't trigger with qemu-kvm (and -no-kvm-irqchip) yet but, on the
>>>>>>>> other hand, I currently do not see a potential bug in upstream's
>>>>>>>> kvm_arch_pre_run. Could you have a look if you can reproduce,
>>>>>>>> specifically if this isn't a KVM kernel issue in the end?
>>>>>>>>
>>>>>>> In kvm_cpu_exec() after calling kvm_arch_pre_run() env->exit_request is
>>>>>>> tested and function can exit without calling kvm_vcpu_ioctl(KVM_RUN).
>>>>>>> Can you check if this what happens in your case?
>>>>>> This path is executed quite frequently here. No obvious correlation with
>>>>>> the lost IRQ.
>>>>>>
>>>>> If kvm_arch_pre_run() injected interrupt kvm_vcpu_ioctl(KVM_RUN) have to
>>>>> be executed before injecting another interrupt, so if on the fist call
>>>>> of kvm_cpu_exec() kvm_arch_pre_run() injected interrupt, but
>>>>> kvm_vcpu_ioctl(KVM_RUN) was not executed because of env->exit_request
>>>>> and on the next kvm_cpu_exec() other interrupt is injected the previous
>>>>> one will be lost.
>>>> ...and kvm_run->ready_for_interrupt_injection is not updated either in
>>>> that case, right? That makes be wonder if KVM_INTERRUPT shouldn't better
>>>> return an error in case the queue is full already.
>>>>
>>> If kvm_vcpu_ioctl(KVM_RUN) is called, but exit happens before interrupt
>>> is injected kvm_run->ready_for_interrupt_injection should be update to
>>> reflect that fact.
>> Yes, but in this case it isn't called if IIUC. So that is the problem
>> upstream KVM faces?
>>
> This is my guest. It tries to inject two different interrupt
> simultaneously and this is not supported (and not correct).
> It can be easily checked if you have reproducible case.
> 
>> Then again: What do you think is the proper long-term fix? Only
>> adjusting upstream KVM (required anyway) or also making the kernel
>> support more robust against this pattern?
> If my guest is correct no fix needed for KVM module (we can enhance
> API to return error as you suggested, but this will not fix buggy
> userspace). You are asking what do I think is the proper long-term
> fix then my answer is: merging qemu-kvm into qemu dropping whatever we
> have there currently ;)

As we won't merge libkvm's structure upstream, we won't see the same
code structure in qemu one day that currently works (correctly) in qemu-kvm.

However, will see if I can fix upstream based on this analysis. Good to
know that we do not have another tricky race in the kvm irq handling part.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html