On Thu, Apr 13, 2017 at 4:09 PM, Radim Krcmar <rkrcmar@xxxxxxxxxx> wrote: > 2017-04-13 16:32+0800, Peter Xu: >> On Wed, Apr 12, 2017 at 02:20:12PM +0200, Ladi Prosek wrote: >> > On Wed, Apr 12, 2017 at 1:57 PM, Peter Xu <peterx@xxxxxxxxxx> wrote: >> > > On Wed, Apr 12, 2017 at 12:28:25PM +0200, Ladi Prosek wrote: >> > >> On Wed, Apr 12, 2017 at 11:37 AM, Ladi Prosek <lprosek@xxxxxxxxxx> wrote: >> > >> > On Wed, Apr 12, 2017 at 11:06 AM, Peter Xu <peterx@xxxxxxxxxx> wrote: >> > >> >> void kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, int vector, int trigger_mode) >> > >> >> { >> > >> >> struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic; >> > >> >> >> > >> >> + /* If we'll be using direct EOI, skip broadcast */ >> > >> >> + if (kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) >> > >> >> + return; >> > >> >> + >> > >> > >> > >> > I've only seen the direct EOI sent for level irqs so I'm afraid that >> > >> > __kvm_ioapic_update_eoi needs to run for edge-triggered even if the >> > >> > APIC_SPIV_DIRECTED_EOI flag is set. >> > > >> > > Yes, if without your patch, it is problematic. But if with your patch, >> > > __kvm_ioapic_update_eoi() will be called in ioapic mmio write then. >> > >> > No, there's no ioapic mmio write for edge-triggered interrupts, at >> > least Windows don't do any. Edge interrupts must still be handled on >> > the lapic EOI path. >> >> I see. Yes it may break that. Though I don't know in what case someone >> would register to this IOAPIC eoi message if it's edge-triggered... > > I don't know of any sane case ... > > KVM PIT (arch/x86/kvm/i8254.c) uses it to keep track of lost edge > interrupts so it could re-inject them. > Windows needed it for reasonable timeflow, IIRC. > > The userspace probably uses it for similar reasons. > > We have to keep the edge notifier working. :( > >> > >> This is what __eoi_ioapic_pin does for version<0x20 and >> > >> on the host side we reset the remote_irr in ioapic_write_indirect if >> > >> I'm reading the code correctly. Wouldn't we want to deliver the >> > >> notification via kvm_notify_acked_irq in this case also? >> > > >> > > I think in that case (EOI sent via "direct EOI" of ioapic mmio >> > > register) the remote_irr is not cleaned in ioapic_write_indirect(), >> > > but it's cleaned in __kvm_ioapic_update_eoi() as well. It has the >> > > line: >> > > >> > > ent->fields.remote_irr = 0; >> > > >> > > Right? >> > >> > Right, but I meant the other case, a guest that sets >> > APIC_SPIV_DIRECTED_EOI but doesn't write to IOAPIC_REG_EOI. With your >> > patch there would be no kvm_notify_acked_irq and that doesn't look >> > correct. >> >> For level triggered irq, if it sets APIC_SPIV_DIRECTED_EOI but doesn't >> write IOAPIC_REG_EOI, it should be a guest bug, right? > > Yes. > >> Thinking about the migration issue that Radim has mentioned, I agree >> maybe we can consider just revert the suppress eoi broadcast. Actually >> now I start to wonder why direct EOI is introduced. I think it should >> be for performance's sake on systems has lots of ioapics? But looks >> like that's normally not the case, at least for kvm and most general >> machines, which only has one ioapic. > > That is what I thought as well. > > I am wondering why Windows uses it -- maybe they remap vector numbers > and therefore need to EOI a different vector? Maybe it has something to do with how they assign devices to the root Hyper-V partition. I've seen the issue only with NICs, if it helps. > The feature might have been introduced because some OS assumed presence > of the feature if x2APIC was available ... > Right, reverting it is pretty safe, because it was broken before. :) > (And we ignore the SPIV completely when forwarding EOI to userspace.) > > Does Windows 2016 work with this patch? Yes, this patch works. > ---8<--- > diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c > index bdff437acbcb..6a1a94d63c52 100644 > --- a/arch/x86/kvm/ioapic.c > +++ b/arch/x86/kvm/ioapic.c > @@ -442,8 +442,7 @@ static void __kvm_ioapic_update_eoi(struct kvm_vcpu *vcpu, > kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i); > spin_lock(&ioapic->lock); > > - if (trigger_mode != IOAPIC_LEVEL_TRIG || > - kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) > + if (trigger_mode != IOAPIC_LEVEL_TRIG) > continue; > > ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG); > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index bad6a25067bc..20fdd93d7dd4 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -319,8 +319,6 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu) > return; > > feat = kvm_find_cpuid_entry(apic->vcpu, 0x1, 0); > - if (feat && (feat->ecx & (1 << (X86_FEATURE_X2APIC & 31)))) > - v |= APIC_LVR_DIRECTED_EOI; > kvm_lapic_set_reg(apic, APIC_LVR, v); > } > > @@ -1636,8 +1634,6 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) > > case APIC_SPIV: { > u32 mask = 0x3ff; > - if (kvm_lapic_get_reg(apic, APIC_LVR) & APIC_LVR_DIRECTED_EOI) > - mask |= APIC_SPIV_DIRECTED_EOI; > apic_set_spiv(apic, val & mask); > if (!(val & APIC_SPIV_APIC_ENABLED)) { > int i;