2017-04-13 08:36+0200, Ladi Prosek: > On Wed, Apr 12, 2017 at 10:52 PM, Radim Krcmar <rkrcmar@xxxxxxxxxx> wrote: >> 2017-04-12 17:06+0800, Peter Xu: >>> On Wed, Apr 12, 2017 at 09:36:58AM +0200, Ladi Prosek wrote: >>> > On Wed, Apr 12, 2017 at 8:40 AM, Peter Xu <peterx@xxxxxxxxxx> wrote: >>> > > On Tue, Apr 11, 2017 at 04:11:15PM +0200, Ladi Prosek wrote: >>> > >> If the guest takes advantage of the directed EOI feature by setting >>> > >> APIC_SPIV_DIRECTED_EOI, it is expected to signal EOI by writing to >>> > >> the EOI register of the respective IOAPIC. >>> > >> >>> > >> From Intel's x2APIC Specification: >>> > >> "following the EOI to the local x2APIC unit for a level triggered >>> > >> interrupt, perform a directed EOI to the IOxAPIC generating the >>> > >> interrupt by writing to its EOI register." >>> > >> >>> > >> Commit fc61b800f950 ("KVM: Add Directed EOI support to APIC emulation") >>> > >> inhibited EOI on LAPIC EOI register write but didn't add the IOAPIC >>> > >> part. IOAPIC_REG_EOI writes were handled only on IA64 and the code >>> > >> was later removed with the rest of IA64 support. >>> > >> >>> > >> The bug has gone undetected for a long time because Linux writes to >>> > >> IOAPIC_REG_EOI only if the IOAPIC version is >=0x20. Windows doesn't >>> > >> seem to perform such a check. >>> > > >>> > > Hi, Ladi, >>> > >>> > Hi Peter, >>> > >>> > > Not sure I'm understanding it correctly... I see "direct EOI" is a >>> > > feature for IOAPIC version 0x20, while "suppress EOI-broadcast" is >>> > > another feature for APIC. They are not the same feature, so it may not >>> > > be required to have them all together. IIUC current x86 kvm is just >>> > > the case - it supports EOI broadcast suppression on APIC, but it does >>> > > not support direct EOI on kernel IOAPIC. >>> > >>> > Thanks, that makes perfect sense and explains why Linux behaves the >>> > way it does (__eoi_ioapic_pin in arch/x86/kernel/apic/io_apic.c). >>> > >>> > This document makes it look like suppress EOI-broadcast always implies >>> > directed EOI, though: >>> > >>> > http://www.intel.com/content/dam/doc/specification-update/64-architecture-x2apic-specification.pdf >>> > >>> > NB "The support for Directed EOI capability can be detected by means >>> > of bit 24 in the Local APIC Version Register. " >>> > >>> > There is no mention of APIC version or any other detection mechanism >>> > for directed EOI. Maybe the chip being x2APIC implies version >= 0x20 >>> > but I don't see that in the document either. >>> > >>> > I suspect that Microsoft implemented EOI by following this same spec. >>> > Level-triggered interrupts don't work right on Windows Server 2016 >>> > with Hyper-V enabled without this patch. >>> >>> Yes, the documents for IOAPIC is always hard to find, at least for >>> me... >>> >>> There is some pages mentioned IOAPIC in ICH9 manual on chap 13.5 here: >>> http://www.intel.com/content/dam/doc/datasheet/io-controller-hub-9-datasheet.pdf >>> >>> However I see nothing related to how the IOAPIC version is defined. In >>> that sense, the comment above __eoi_ioapic_pin() seems to be better. :) >> >> Yeah, it is officially described in ICH9 datasheet as: >> >> 13.5.6 VER—Version Register (LPC I/F—D31:F0) >> Default Value: 00170020h >> >> The one we emulate in KVM is in 82093AA datasheet: >> >> 3.2.2. IOAPICVER—IOAPIC VERSION REGISTER >> Default Value: 00170011h >> >> The former has the EOI register, the latter doesn't. > > Got it. Do you want me to resubmit with a different commit message and > a comment blaming the guest OS instead of calling it a KVM bug? :) Or > do you think it's worth exploring Peter's suggestion to make a more > invasive fix? I would experiment with reverting the APIC suppress-EOI-broadcast first. It doesn't currently work well and is pointless in theory. >> --- >> I don't like the idea behind the patch, but it is acceptable and >> thinking about good solutions gets us into compatibility nightmares ... >> (We could remove support for directed EOI, because it is a detectable >> feature and makes little sense in KVM, or we could implement the IOAPIC >> version 0x20, but both would be tricky to migrate.) >> >> People should switch to userspace IOAPIC anyway. :) > > For the record, QEMU with kernel-irqchip=split works fine as it > emulates version 0x20 with the IOAPIC_EOI register by default. Great, thanks. > kernel-irqchip=off does not seem to work, I will look into it. Well, good luck. :)