On Fri, May 23, 2014 at 04:51:53PM +0200, Paolo Bonzini wrote: > When Hyper-V enlightenments are in effect, Windows prefers to issue an > Hyper-V MSR write to issue an EOI rather than an x2apic MSR write. > The Hyper-V MSR write is not handled by the processor, and besides > being slower, this also causes bugs with APIC virtualization. The > reason is that on EOI the processor will modify the highest in-service > interrupt (SVI) field of the VMCS, as explained in section 29.1.4 of > the SDM. > > We need to do the same, and be careful not to muck with the isr_count > and highest_isr_cache fields that are unused when virtual interrupt > delivery is enabled. > > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx> FWIW Acked-by: Michael S. Tsirkin <mst@xxxxxxxxxx> some questions below. > --- > arch/x86/kvm/lapic.c | 62 ++++++++++++++++++++++++++++++++++--------------- > 1 files changed, 43 insertions(+), 19 deletions(-) > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index 9736529..0069118 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -360,6 +360,8 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic) > > static inline void apic_set_isr(int vec, struct kvm_lapic *apic) > { > + /* Note that we never get here with APIC virtualization enabled. */ > + > if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR)) > ++apic->isr_count; > BUG_ON(apic->isr_count > MAX_APIC_VECTOR); > @@ -371,12 +373,48 @@ static inline void apic_set_isr(int vec, struct kvm_lapic *apic) > apic->highest_isr_cache = vec; > } > > +static inline int apic_find_highest_isr(struct kvm_lapic *apic) > +{ > + int result; > + > + /* > + * Note that isr_count is always 1, and highest_isr_cache > + * is always -1, with APIC virtualization enabled. > + */ > + if (!apic->isr_count) > + return -1; > + if (likely(apic->highest_isr_cache != -1)) > + return apic->highest_isr_cache; > + > + result = find_highest_vector(apic->regs + APIC_ISR); > + ASSERT(result == -1 || result >= 16); > + > + return result; > +} > + > static inline void apic_clear_isr(int vec, struct kvm_lapic *apic) > { > - if (__apic_test_and_clear_vector(vec, apic->regs + APIC_ISR)) > + struct kvm_vcpu *vcpu; > + if (!__apic_test_and_clear_vector(vec, apic->regs + APIC_ISR)) > + return; > + > + vcpu = apic->vcpu; > + > + /* > + * We do get here for APIC virtualization enabled if the guest > + * uses the Hyper-V APIC enlightenment. In this case we may need > + * to trigger a new interrupt delivery by writing the SVI field; > + * on the other hand isr_count and highest_isr_cache are unused > + * and must be left alone. > + */ > + if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) > + kvm_x86_ops->hwapic_isr_update(vcpu->kvm, > + apic_find_highest_isr(apic)); I note that an indirect call on data path is mostly unnecessary here: static int vmx_vm_has_apicv(struct kvm *kvm) { return enable_apicv && irqchip_in_kernel(kvm); } is all it does, and irqchip_in_kernel also has an rmb within it, which is somewhat expensive on x86: and there's no way to reach this code with irqchip disabled, correct? So this is penalizing non-apicv boxes. How about adding a bool flag in kvm_vcpu_arch, and testing that? Or even move the enable_apicv module parameter to x86. > + else { > --apic->isr_count; > - BUG_ON(apic->isr_count < 0); > - apic->highest_isr_cache = -1; > + BUG_ON(apic->isr_count < 0); > + apic->highest_isr_cache = -1; > + } > } > > int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu) > @@ -456,22 +494,6 @@ static void pv_eoi_clr_pending(struct kvm_vcpu *vcpu) > __clear_bit(KVM_APIC_PV_EOI_PENDING, &vcpu->arch.apic_attention); > } > > -static inline int apic_find_highest_isr(struct kvm_lapic *apic) > -{ > - int result; > - > - /* Note that isr_count is always 1 with vid enabled */ > - if (!apic->isr_count) > - return -1; > - if (likely(apic->highest_isr_cache != -1)) > - return apic->highest_isr_cache; > - > - result = find_highest_vector(apic->regs + APIC_ISR); > - ASSERT(result == -1 || result >= 16); > - > - return result; > -} > - > void kvm_apic_update_tmr(struct kvm_vcpu *vcpu, u32 *tmr) > { > struct kvm_lapic *apic = vcpu->arch.apic; > @@ -1605,6 +1627,8 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu) > int vector = kvm_apic_has_interrupt(vcpu); > struct kvm_lapic *apic = vcpu->arch.apic; > > + /* Note that we never get here with APIC virtualization enabled. */ > + > if (vector == -1) > return -1; > > -- > 1.7.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html