Marcelo Tosatti wrote on 2013-02-24: > On Sat, Feb 23, 2013 at 03:16:11PM +0000, Zhang, Yang Z wrote: >> Marcelo Tosatti wrote on 2013-02-23: >>> On Sat, Feb 23, 2013 at 02:05:28PM +0000, Zhang, Yang Z wrote: >>>> Marcelo Tosatti wrote on 2013-02-23: >>>>> >>>>> On Fri, Feb 22, 2013 at 09:42:32PM +0800, Yang Zhang wrote: >>>>>> From: Yang Zhang <yang.z.zhang@xxxxxxxxx> >>>>>> >>>>>> Posted Interrupt allows APIC interrupts to inject into guest directly >>>>>> without any vmexit. >>>>>> >>>>>> - When delivering a interrupt to guest, if target vcpu is running, >>>>>> update Posted-interrupt requests bitmap and send a notification >>>>>> event to the vcpu. Then the vcpu will handle this interrupt >>>>>> automatically, without any software involvemnt. - If target vcpu is >>>>>> not running or there already a notification event pending in the >>>>>> vcpu, do nothing. The interrupt will be handled by next vm entry >>>>>> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx> >>>>>> --- >>>>>> >>>>>> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c >>>>>> index e4595f1..64616cc 100644 >>>>>> --- a/arch/x86/kernel/irq.c >>>>>> +++ b/arch/x86/kernel/irq.c >>>>>> @@ -228,6 +228,26 @@ void smp_x86_platform_ipi(struct pt_regs *regs) >>>>>> set_irq_regs(old_regs); >>>>>> } >>>>>> +#ifdef CONFIG_HAVE_KVM >>>>>> +/* >>>>>> + * Handler for POSTED_INTERRUPT_VECTOR. >>>>>> + */ >>>>>> +void smp_kvm_posted_intr_ipi(struct pt_regs *regs) >>>>>> +{ >>>>>> + struct pt_regs *old_regs = set_irq_regs(regs); >>>>>> + >>>>>> + ack_APIC_irq(); >>>>>> + >>>>>> + irq_enter(); >>>>>> + >>>>>> + exit_idle(); >>>>>> + >>>>>> + irq_exit(); >>>>>> + >>>>>> + set_irq_regs(old_regs); >>>>>> +} >>>>>> +#endif >>>>>> + >>>>> >>>>> Add per-cpu counters, similarly to other handlers in the same file. >>>> Sure. >>>> >>>>>> @@ -379,6 +398,7 @@ static inline int apic_find_highest_irr(struct > kvm_lapic >>>>> *apic) >>>>>> if (!apic->irr_pending) return -1; >>>>>> + kvm_x86_ops->sync_pir_to_irr(apic->vcpu); result = >>>>>> apic_search_irr(apic); ASSERT(result == -1 || result >= 16); >>>>> >>>>> kvm_x86_ops->sync_pir_to_irr() should be called from vcpu_enter_guest, >>>>> before inject_pending_event. >>>>> >>>>> if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) > { >>>>> -> >>>>> inject_pending_event >>>>> ... >>>>> } >>>> Some other places will search irr to do something, like >>> kvm_cpu_has_interrupt(). So we must to sync pir to irr before touch >>> irr, not just before enter guest. >>> >>> I see. The only call site that needs IRR/PIR information with posted >>> interrupt enabled is kvm_arch_vcpu_runnable, correct? >> Yes. >> >>> If so, can we contain reading PIR to only that location? >> Yes, we can do it. Actually, why I do pir->irr here is to avoid the >> wrong usage in future of check pending interrupt just by call >> kvm_vcpu_has_interrupt(). > > Yes, i see. > >> Also, there may need an extra callback to check pir. >> So you think your suggestions is better? > > Because it respects standard request processing mechanism, yes. Sure. Will change it in next patch. >>> And have PIR->IRR sync at KVM_REQ_EVENT processing only (to respect the >>> standard way of event processing and also reduce reading the PIR). >> Since we will check ON bit before each reading PIR, the cost can be >> ignored. > > Note reading ON bit is not OK, because if injection path did not see > vcpu->mode == IN_GUEST_MODE, ON bit will not be set. In my patch, It will set ON bit before check vcpu->mode. So it's ok. Actually, I think the ON bit must be set unconditionally after change PIR regardless of vcpu->mode. >>> >>> So it is possible that a PIR IPI is delivered and handled before guest >>> entry. >>> >>> By clearing PIR notification bit after local_irq_disable, but before the >>> last check for vcpu->requests, we guarantee that a PIR IPI is never lost. >>> >>> Makes sense? (please check the logic, it might be flawed). >> I am not sure whether I understand you. But I don't think the IPI will >> lost in current patch: >> >> if (!pi_test_and_set_on(&vmx->pi_desc) && (vcpu->mode == > IN_GUEST_MODE)) { >> kvm_make_request(KVM_REQ_EVENT, vcpu); >> apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), >> POSTED_INTR_VECTOR); >> *delivered = true; >> } >> >> vcpu entry has: >> vcpu->mode = GUEST_MODE >> local irq disable >> check request >> >> We will send the IPI after making request, if IPI is received after set > guest_mode before disable irq, then it still will be handled by the following check > request. >> Please correct me if I am wrong. > > cpu0 cpu1 vcpu0 > test_and_set_bit(PIR-A) > set KVM_REQ_EVENT > process REQ_EVENT > PIR-A->IRR > > vcpu->mode=IN_GUEST > > if (vcpu0->guest_mode) > if (!t_a_s_bit(PIR notif)) > send IPI > linux_pir_handler > > t_a_s_b(PIR-B)=1 > no PIR IPI sent No, this exists with your implementation not mine. As I said, in this patch, it will make request after vcpu ==guest mode, then send ipi. Even the ipi is lost, but the request still will be tracked. Because we have the follow check: vcpu->mode = GUEST_MODE (ipi may arrived here and lost) local irq disable check request (this will ensure the pir modification will be picked up by vcpu before vmentry) >>> >>> Sync PIR->IRR uses xchg, which is locked and atomic, so can't see the >>> need for the spinlock there. >> In function sync_pir_to_irr(): >> tmp_pir=xchg(pir,0) >> xchg(tmp_pir, irr) >> >> It is not atomically, the lock still is needed. > > IRR is only accessed by local vcpu context, only PIR is accessed concurrently, > therefore only PIR accesses must be atomic. xchg instruction is > locked and atomic. There has same problem as we discussed before. Consider this: Before delivery poste ipi, the irr is 0. then: cpu0 cpu1 vcpu0 delivery_posted_intr() sync_pir_toirr(): tmp_pir=xchg(pir,0)(pir is cleared) check pir(pass) check irr(pass) xchg(tmp_pir, irr) injected= true set pir Same problem: both pir and irr is set, but it reported as injected. Still need the lock to protect it. > +void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir) +{ + > u32 i, pir_val; + struct kvm_lapic *apic = vcpu->arch.apic; + + > for (i = 0; i <= 7; i++) { + pir_val = xchg(&pir[i], > 0); + if (pir_val) + *((u32 > *)(apic->regs + APIC_IRR + i * 0x10)) |= pir_val; + } +} > +EXPORT_SYMBOL_GPL(kvm_apic_update_irr); > > Right? > Best regards, Yang -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html