2017-10-04 15:56+0800, Wanpeng Li:
> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@xxxxxxxxxx>:
> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> >>
> >> Vectors 0-15 are reserved, and a physical LAPIC - upon sending or
> >> receiving one - would generate an APIC error instead of doing the
> >> requested action. Make our emulation behave similarly.
> >>
> >> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> >> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> >> ---
> >>  arch/x86/kvm/lapic.c | 30 ++++++++++++++++++++++++++++--
> >>  1 file changed, 28 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >> index 6bafd06..a779ba9 100644
> >> --- a/arch/x86/kvm/lapic.c
> >> +++ b/arch/x86/kvm/lapic.c
> >> @@ -935,6 +935,25 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
> >>          return ret;
> >>  }
> >>
> >> +static void apic_error(struct kvm_lapic *apic, unsigned long errmask)
> >> +{
> >> +        uint32_t esr;
> >> +
> >> +        esr = kvm_lapic_get_reg(apic, APIC_ESR);
> >> +
> >> +        if ((esr & errmask) != errmask) {
> >
> > The spec makes me think that there is going to be only 1 interrupt
> > (regardless of the number errors) until the software writes 0 to
> > APIC_ESR.  Is there a better description than the following 10.5.3?
> >
> >   The ESR is a write/read register. Before attempt to read from the ESR,
> >   software should first write to it. (The value written does not affect
> >   the values read subsequently; only zero may be written in x2APIC
> >   mode.) This write clears any previously logged errors and updates the
> >   ESR with any errors detected since the last write to the ESR. This
> >   write also rearms the APIC error interrupt triggering mechanism.
> >
> > This also describes a different handling of APIC_ESR -- APIC_ESR is
> > updated only on software writes to APIC_ESR.
> > All errors in between seem to be logged internally (not sure where to
> > migrate it).
>
> Is there any thing need to be changed in this function?

Yes.

For the first part, it should really be tested on bare-metal and
modelled upon that.  SDM mentions some kind of rearming and APM doesn't
so we maybe could just send the interrupt every time (if unmasked).
And maybe vectors from external interrupts trigger the error too, but we
definitely don't need to sort that out immediately.

For the second part, the LAPIC error doesn't cause a write to APIC_ESR.
We need to add a state for pending errors (and somehow migrate it) that
gets copied to APIC_ESR after a write.

> >> +                uint32_t lvterr = kvm_lapic_get_reg(apic, APIC_LVTERR);
> >> +
> >> +                kvm_lapic_set_reg(apic, APIC_ESR, esr | errmask);
> >> +                if (!(lvterr & APIC_LVT_MASKED)) {
> >> +                        struct kvm_lapic_irq irq;
> >> +
> >> +                        irq.vector = lvterr & 0xff;
> >> +                        kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> >> +                }
> >> +        }
> >> +}
> >> +
> >>  /*
> >>   * Add a pending IRQ into lapic.
> >>   * Return 1 if successfully added and 0 if discarded.
> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >>          int result = 0;
> >>          struct kvm_vcpu *vcpu = apic->vcpu;
> >>
> >> +        if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
> >> +                apic_error(apic, APIC_ESR_RECVILL);
> >
> > The error is also triggered if lowest priority is supported and tries to
> > deliver an invalid vector.
>
> Could you point out this in SDM? :)

In section 10.5.3 Error Handling:

  If the local APIC does not support the sending of lowest-priority IPIs
  and software writes the ICR to send a lowest-priority IPI with an
  illegal vector, the local APIC sets only the “redirectible IPI” error
  bit.

Hence, if local APIC does support lowest-priority, then it throws the
same error as fixed.  (KVM does support lowest-priority.)
> >
> >> +                return 0;
> >> +        }
> >> +
> >>          trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
> >>                                    trig_mode, vector);
> >>          switch (delivery_mode) {
> >> @@ -1146,7 +1170,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
> >>                     irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
> >>                     irq.vector, irq.msi_redir_hint);
> >>
> >> -        kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> >> +        if (unlikely(irq.vector < 16 && irq.delivery_mode == APIC_DM_FIXED))
> >
> > Please check how APICv self-IPI acceleration behaves, so we're
> > consistent.
>
> There is no vmexit for APICv self-IPI, so I think we can't intercept
> the vector.

Right, so does it deliver the 0-15 vector?  If yes, then we should do
that as well.  Otherwise, where does it save the error flag and does it
send an error interrupt?  Or do we get a VM exit after all?

Thanks.
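
A rough, standalone sketch of the pending-error state discussed for the
second part, in plain C rather than kernel code.  Everything named
lapic_model* here is hypothetical and only for illustration; the ESR bit
values and the LVT mask bit follow the SDM layout.  The "one interrupt
until the next ESR write" behaviour is an assumption based on the 10.5.3
text quoted above, and what happens to the arming while LVTERR is masked
is a guess, so both would still have to be confirmed on bare metal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ESR_SEND_ILLEGAL_VECTOR (1u << 5)   /* ESR bit 5 */
#define ESR_RECV_ILLEGAL_VECTOR (1u << 6)   /* ESR bit 6 */
#define LVT_MASKED              (1u << 16)  /* mask bit of an LVT entry */

/* Hypothetical model state; not the layout of struct kvm_lapic. */
struct lapic_model {
	uint32_t esr;             /* value software reads back */
	uint32_t pending_errors;  /* logged since the last ESR write */
	uint32_t lvterr;          /* LVT error entry (vector + mask bit) */
	bool error_irq_armed;     /* assumed: rearmed only by an ESR write */
	unsigned int irqs_raised; /* stand-in for actual interrupt delivery */
};

static void lapic_model_init(struct lapic_model *m, uint32_t lvterr)
{
	assert(m);
	m->esr = 0;
	m->pending_errors = 0;
	m->lvterr = lvterr;
	m->error_irq_armed = true;
	m->irqs_raised = 0;
}

/* Vectors 0-15 are reserved. */
static bool lapic_model_vector_is_illegal(unsigned int vector)
{
	return vector < 16;
}

/* Log an error internally; the visible ESR is deliberately not touched. */
static void lapic_model_log_error(struct lapic_model *m, uint32_t errmask)
{
	assert(m);
	m->pending_errors |= errmask;

	/* Assumption: at most one interrupt until software writes the ESR. */
	if (m->error_irq_armed && !(m->lvterr & LVT_MASKED)) {
		m->error_irq_armed = false;
		m->irqs_raised++;
	}
}

/* Software write to the ESR: publish pending errors and rearm. */
static void lapic_model_write_esr(struct lapic_model *m)
{
	assert(m);
	m->esr = m->pending_errors;
	m->pending_errors = 0;
	m->error_irq_armed = true;
}
```

With this shape, pending_errors and error_irq_armed are the extra state
that would need to be migrated alongside the register page.  If bare
metal turns out to send the interrupt on every error, as the APM wording
suggests it might, the armed flag simply goes away.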