Re: [Qemu-devel] E5-2620v2 - emulation stop error

On Mon, Mar 30, 2015 at 9:56 PM, Radim Krčmář <rkrcmar@xxxxxxxxxx> wrote:
> 2015-03-27 13:16+0300, Andrey Korolyov:
>> On Fri, Mar 27, 2015 at 12:03 AM, Bandan Das <bsd@xxxxxxxxxx> wrote:
>> > Radim Krčmář <rkrcmar@xxxxxxxxxx> writes:
>> >> I second Bandan -- checking that it reproduces on another machine would be
>> >> great for sanity :)  (Although a bug in our APICv is far more likely.)
>> >
>> > If it's APICv related, a run without apicv enabled could give more hints.
>> >
>> > Your "devices not getting reset" hypothesis makes the most sense to me,
>> > maybe the timer vector in the error message is just one part of
>> > the whole story. Another misbehaving interrupt from the dark comes in at the
>> > same time and leads to a double fault.
>>
>> Default trace (APICv enabled, first reboot introduced the issue):
>> http://xdel.ru/downloads/kvm-e5v2-issue/hanged-reboot-apic-on.dat.gz
>
> The relevant part is here,
> prefixed with "qemu-system-x86-4180  [002]   697.111550:"
>
>   kvm_exit:             reason CR_ACCESS rip 0xd272 info 0 0
>   kvm_cr:               cr_write 0 = 0x10
>   kvm_mmu_get_page:     existing sp gfn 0 0/4 q0 direct --- !pge !nxe root 0 sync
>   kvm_entry:            vcpu 0
>   kvm_emulate_insn:     f0000:d275: ea 7a d2 00 f0
>   kvm_emulate_insn:     f0000:d27a: 2e 0f 01 1e f0 6c
>   kvm_emulate_insn:     f0000:d280: 31 c0
>   kvm_emulate_insn:     f0000:d282: 8e e0
>   kvm_emulate_insn:     f0000:d284: 8e e8
>   kvm_emulate_insn:     f0000:d286: 8e c0
>   kvm_emulate_insn:     f0000:d288: 8e d8
>   kvm_emulate_insn:     f0000:d28a: 8e d0
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xd28f info 0 800000f6
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EPT_VIOLATION rip 0x8dd0 info 184 0
>   kvm_page_fault:       address f8dd0 error_code 184
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXTERNAL_INTERRUPT rip 0x8dd0 info 0 800000f6
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EPT_VIOLATION rip 0x76d6 info 184 0
>   kvm_page_fault:       address f76d6 error_code 184
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXTERNAL_INTERRUPT rip 0x76d6 info 0 800000f6
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason PENDING_INTERRUPT rip 0xd331 info 0 0
>   kvm_inj_virq:         irq 8
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xfea5 info 0 800000f6
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EPT_VIOLATION rip 0xfea5 info 184 0
>   kvm_page_fault:       address ffea5 error_code 184
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xfea5 info 0 800000f6
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EPT_VIOLATION rip 0xe990 info 184 0
>   kvm_page_fault:       address fe990 error_code 184
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXTERNAL_INTERRUPT rip 0xe990 info 0 800000f6
>   kvm_entry:            vcpu 0
>   kvm_exit:             reason EXCEPTION_NMI rip 0xd334 info 0 80000b0d
>   kvm_userspace_exit:   reason KVM_EXIT_INTERNAL_ERROR (17)
>
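For anyone following the trace by hand, the last field on the kvm_exit lines can be unpacked with a few bit operations. This is only a minimal sketch, assuming that field is the VM-exit interruption information as laid out in the Intel SDM (vector in bits 7:0, type in bits 10:8, error-code-valid in bit 11, valid in bit 31). The helper name is made up for illustration and is not part of KVM or trace-cmd.

```python
# Hypothetical helper, not part of KVM or trace-cmd.
# Assumes the value is a VMX "VM-exit interruption information" word.
INTR_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    4: "software interrupt",
    5: "privileged software exception",
    6: "software exception",
}


def decode_intr_info(value):
    """Split an interruption-information word into its fields."""
    return {
        "valid": bool((value >> 31) & 1),           # bit 31
        "vector": value & 0xFF,                     # bits 7:0
        "type": INTR_TYPES.get((value >> 8) & 0x7,  # bits 10:8
                               "reserved/other"),
        "has_error_code": bool((value >> 11) & 1),  # bit 11
    }


if __name__ == "__main__":
    for raw in (0x800000F6, 0x80000B0D):
        print(hex(raw), decode_intr_info(raw))
```

Under that reading, 800000f6 is an external interrupt on vector 0xf6, and 80000b0d on the final EXCEPTION_NMI exit is a hardware exception with vector 13 (#GP on x86) that carries an error code.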
>> Trace without APICv (three reboots, just to make sure to hit the
>> problematic condition of the supposed DF, as it is still not one hundred
>> percent reproducible):
>> http://xdel.ru/downloads/kvm-e5v2-issue/apic-off.dat.gz
>
> The trace here contains a well matching excerpt, just instead of the
> EXCEPTION_NMI, it does
>
>  169.905098: kvm_exit:             reason EPT_VIOLATION rip 0xd334 info 181 0
>  169.905102: kvm_page_fault:       address feffd066 error_code 181
>
> and works.  Page fault says we tried to read 0xfeffd066 -- probably IOPB
> of TSS.  (I guess it is pre-fetch for following IO instruction.)
>
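The "tried to read" conclusion can be checked from the error_code value itself. A rough sketch, assuming error_code on the kvm_page_fault lines is the EPT-violation exit qualification from the Intel SDM (bit 0 read, bit 1 write, bit 2 instruction fetch, bit 7 guest linear address valid, bit 8 access to the final translation), and assuming the f0000 in the kvm_emulate_insn lines is the CS base. Both function names are hypothetical.

```python
# Hypothetical helpers; bit meanings assumed from the Intel SDM
# description of the EPT-violation exit qualification.
def decode_ept_qual(qual):
    """Return the access kinds and address flags encoded in the value."""
    access = [name
              for bit, name in ((0, "read"), (1, "write"), (2, "fetch"))
              if (qual >> bit) & 1]
    return {
        "access": access,
        "gla_valid": bool((qual >> 7) & 1),          # bit 7
        "final_translation": bool((qual >> 8) & 1),  # bit 8
    }


def linear_address(segment_base, offset):
    """Linear address for a given segment base and offset (rip)."""
    return segment_base + offset


if __name__ == "__main__":
    print(decode_ept_qual(0x184))   # the faults in the APICv trace
    print(decode_ept_qual(0x181))   # the fault at rip 0xd334 without APICv
    print(hex(linear_address(0xF0000, 0x8DD0)))
```

With those assumptions, 184 decodes as an instruction fetch and 181 as a data read. The fetch faults also line up with the rip values: 0xf0000 + 0x8dd0 gives f8dd0, the address reported right after that exit.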
> Nothing strikes me when looking at it, but some APICv boots don't fail,
> so it would be interesting to compare them ... host's 0xf6 interrupt
> (IRQ_WORK_VECTOR) is a possible source of races.  (We could look more
> closely.  It is fired too often for my liking as well.)


Thanks Radim, here is a trace of a boot with APICv that did not fail:
http://xdel.ru/downloads/kvm-e5v2-issue/no-fail-with-apicv.dat.gz

(I missed the right button in the mailer previously.)

The relevant parts look the same to me as with enable_apicv=0.