On 29/03/19 15:40, Vitaly Kuznetsov wrote:
> Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:
>
>> On 28/03/19 21:31, Vitaly Kuznetsov wrote:
>>>
>>> The 'hang' scenario develops like this:
>>> 1) Hyper-V boots and QEMU is trying to inject two irq simultaneously. One
>>> of them is level-triggered. KVM injects the edge-triggered one and
>>> requests immediate exit to inject the level-triggered:
>>>
>>> kvm_set_irq: gsi 23 level 1 source 0
>>> kvm_msi_set_irq: dst 0 vec 80 (Fixed|physical|level)
>>> kvm_apic_accept_irq: apicid 0 vec 80 (Fixed|edge)
>>> kvm_msi_set_irq: dst 0 vec 96 (Fixed|physical|edge)
>>> kvm_apic_accept_irq: apicid 0 vec 96 (Fixed|edge)
>>> kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000060 int_info_err 0
>>>
>>> 2) Hyper-V requires one of its VMs to run to handle the situation but
>>> immediate exit happens:
>>>
>>> kvm_entry: vcpu 0
>>> kvm_exit: reason VMRESUME rip 0xfffff80006a40115 info 0 0
>>> kvm_entry: vcpu 0
>>> kvm_exit: reason PREEMPTION_TIMER rip 0xfffff8022f3d8350 info 0 0
>>> kvm_nested_vmexit: rip fffff8022f3d8350 reason PREEMPTION_TIMER info1 0 info2 0 int_info 0 int_info_err 0
>>> kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000050 int_info_err 0
>>
>> I supposed before this there was an eoi for vector 96?
>
> AFAIR: no, it seems that it is actually the VM it is trying to resume
> (Windows partition?) which needs to do some work and with the preemption
> timer of 0 we don't allow it to.

kvm_apic_accept_irq placed IRQ 96 in IRR, and Hyper-V should be running
with "acknowledge interrupt on exit" since int_info is nonzero in
kvm_nested_vmexit_inject.  Therefore, at the kvm_nested_vmexit_inject
tracepoint KVM should have set bit 96 in ISR; and because PPR is now 96,
interrupt 80 should have never been delivered.

Unless 96 is an auto-EOI interrupt, in which case this comment would apply

	/*
	 * For auto-EOI interrupts, there might be another pending
	 * interrupt above PPR, so check whether to raise another
	 * KVM_REQ_EVENT.
	 */

IIRC there was an enlightenment to tell Windows "I support auto-EOI but
please don't use it".  If this is what's happening, that would also fix it.

Thanks,

Paolo
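
For readers following the priority argument, here is a toy model of the
IRR/ISR/PPR rule it relies on.  This is only an illustrative sketch, not
KVM code: struct toy_apic and all toy_* helpers are hypothetical names
made up for this example, and the model assumes the usual local APIC rule
that a pending vector is deliverable only if its priority class (bits 7:4)
is above the class of PPR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy local APIC state (hypothetical, not KVM's struct kvm_lapic). */
struct toy_apic {
	uint32_t irr[8];	/* pending vectors, one bit each */
	uint32_t isr[8];	/* in-service vectors, one bit each */
	uint8_t tpr;		/* task priority register */
};

static void toy_set(uint32_t *reg, uint8_t vec)
{
	reg[vec / 32] |= 1u << (vec % 32);
}

static void toy_clear(uint32_t *reg, uint8_t vec)
{
	reg[vec / 32] &= ~(1u << (vec % 32));
}

/* Highest vector set in a 256-bit register, or -1 if none. */
static int toy_highest(const uint32_t *reg)
{
	for (int v = 255; v >= 0; v--)
		if (reg[v / 32] & (1u << (v % 32)))
			return v;
	return -1;
}

/* PPR is TPR unless the highest in-service vector has a higher class. */
static uint8_t toy_ppr(const struct toy_apic *a)
{
	int isrv = toy_highest(a->isr);
	uint8_t isr_class = isrv < 0 ? 0 : (uint8_t)(isrv & 0xf0);

	return (a->tpr & 0xf0) >= isr_class ? a->tpr : isr_class;
}

/* A vector can be delivered only if its class is above PPR's class. */
static bool toy_deliverable(const struct toy_apic *a, uint8_t vec)
{
	return (vec & 0xf0) > (toy_ppr(a) & 0xf0);
}

/*
 * Acknowledge a pending vector: it leaves IRR and, unless it is an
 * auto-EOI interrupt, is marked in ISR (which raises PPR).
 */
static void toy_ack(struct toy_apic *a, uint8_t vec, bool auto_eoi)
{
	toy_clear(a->irr, vec);
	if (!auto_eoi)
		toy_set(a->isr, vec);
}
```

With vectors 96 and 80 both pending, toy_ack(&a, 96, false) leaves PPR at
0x60 and toy_deliverable(&a, 80) is false; with toy_ack(&a, 96, true) the
ISR stays empty, PPR stays at TPR, and vector 80 is deliverable right away,
which is the auto-EOI case the quoted comment is about.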