Re: [PATCH v4 2/2] KVM: VMX: Add Posted Interrupt supporting

Marcelo Tosatti <mtosatti@xxxxxxxxxx> · Sat, 23 Feb 2013 13:50:34 -0300

On Sat, Feb 23, 2013 at 03:16:11PM +0000, Zhang, Yang Z wrote:
> Marcelo Tosatti wrote on 2013-02-23:
> > On Sat, Feb 23, 2013 at 02:05:28PM +0000, Zhang, Yang Z wrote:
> >> Marcelo Tosatti wrote on 2013-02-23:
> >>> 
> >>> On Fri, Feb 22, 2013 at 09:42:32PM +0800, Yang Zhang wrote:
> >>>> From: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >>>> 
> >>>> Posted Interrupt allows APIC interrupts to inject into guest directly
> >>>> without any vmexit.
> >>>> 
> >>>> - When delivering a interrupt to guest, if target vcpu is running,
> >>>>   update Posted-interrupt requests bitmap and send a notification
> >>>>   event to the vcpu. Then the vcpu will handle this interrupt
> >>>>   automatically, without any software involvemnt. - If target vcpu is
> >>>>   not running or there already a notification event pending in the
> >>>>   vcpu, do nothing. The interrupt will be handled by next vm entry
> >>>> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >>>> ---
> >>>> 
> >>>> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> >>>> index e4595f1..64616cc 100644
> >>>> --- a/arch/x86/kernel/irq.c
> >>>> +++ b/arch/x86/kernel/irq.c
> >>>> @@ -228,6 +228,26 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
> >>>>  	set_irq_regs(old_regs);
> >>>>  }
> >>>> +#ifdef CONFIG_HAVE_KVM
> >>>> +/*
> >>>> + * Handler for POSTED_INTERRUPT_VECTOR.
> >>>> + */
> >>>> +void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
> >>>> +{
> >>>> +	struct pt_regs *old_regs = set_irq_regs(regs);
> >>>> +
> >>>> +	ack_APIC_irq();
> >>>> +
> >>>> +	irq_enter();
> >>>> +
> >>>> +	exit_idle();
> >>>> +
> >>>> +	irq_exit();
> >>>> +
> >>>> +	set_irq_regs(old_regs);
> >>>> +}
> >>>> +#endif
> >>>> +
> >>> 
> >>> Add per-cpu counters, similarly to other handlers in the same file.
> >> Sure.
> >> 
> >>>> @@ -379,6 +398,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic
> >>> *apic)
> >>>>  	if (!apic->irr_pending) 		return -1;
> >>>>  +	kvm_x86_ops->sync_pir_to_irr(apic->vcpu); 	result =
> >>>>  apic_search_irr(apic); 	ASSERT(result == -1 || result >= 16);
> >>> 
> >>> kvm_x86_ops->sync_pir_to_irr() should be called from vcpu_enter_guest,
> >>> before inject_pending_event.
> >>> 
> >>>         if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> >>> 		->
> >>> 		inject_pending_event
> >>> 		...
> >>> 	}
> >> Some other places will search irr to do something, like
> > kvm_cpu_has_interrupt(). So we must to sync pir to irr before touch irr, not just
> > before enter guest.
> > 
> > I see. The only call site that needs IRR/PIR information with posted
> > interrupt enabled is kvm_arch_vcpu_runnable, correct?
> Yes.
> 
> > If so, can we contain reading PIR to only that location?
> Yes, we can do it. 
> Actually, why I do pir->irr here is to avoid the wrong usage in future of check pending interrupt just by
> call kvm_vcpu_has_interrupt().

Yes, i see.

> Also, there may need an extra callback to check pir.
> So you think your suggestions is better?

Because it respects standard request processing mechanism, yes.

> > And have PIR->IRR sync at KVM_REQ_EVENT processing only (to respect the
> > standard way of event processing and also reduce reading the PIR).
> Since we will check ON bit before each reading PIR, the cost can be
> ignored.

Note reading ON bit is not OK, because if injection path did not see
vcpu->mode == IN_GUEST_MODE, ON bit will not be set.

> > 
> > So it is possible that a PIR IPI is delivered and handled before guest
> > entry.
> > 
> > By clearing PIR notification bit after local_irq_disable, but before the
> > last check for vcpu->requests, we guarantee that a PIR IPI is never lost.
> > 
> > Makes sense? (please check the logic, it might be flawed).
> I am not sure whether I understand you. But I don't think the IPI will lost in current patch:
> 
> if (!pi_test_and_set_on(&vmx->pi_desc) && (vcpu->mode == IN_GUEST_MODE)) {
>                 kvm_make_request(KVM_REQ_EVENT, vcpu);
>                 apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
>                                     POSTED_INTR_VECTOR);
>                 *delivered = true;
> }
> 
> vcpu entry has: 
> vcpu->mode = GUEST_MODE
> local irq disable
> check request
> 
> We will send the IPI after making request, if IPI is received after set guest_mode before disable irq, then it still will be handled by the following check request.
> Please correct me if I am wrong.

cpu0					cpu1		vcpu0
test_and_set_bit(PIR-A)					
set KVM_REQ_EVENT					
							process REQ_EVENT
							PIR-A->IRR

							vcpu->mode=IN_GUEST

if (vcpu0->guest_mode)
	if (!t_a_s_bit(PIR notif))			
		send IPI
							linux_pir_handler

					t_a_s_b(PIR-B)=1
					no PIR IPI sent

> > 
> > Sync PIR->IRR uses xchg, which is locked and atomic, so can't see the
> > need for the spinlock there.
> In function sync_pir_to_irr():
> tmp_pir=xchg(pir,0)
> xchg(tmp_pir, irr)
> 
> It is not atomically, the lock still is needed.

IRR is only accessed by local vcpu context, only PIR is accessed concurrently,
therefore only PIR accesses must be atomic. xchg instruction is 
locked and atomic.

+void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
+{
+       u32 i, pir_val;
+       struct kvm_lapic *apic = vcpu->arch.apic;
+
+       for (i = 0; i <= 7; i++) {
+               pir_val = xchg(&pir[i], 0);
+               if (pir_val)
+                       *((u32 *)(apic->regs + APIC_IRR + i * 0x10)) |= pir_val;
+       }
+}
+EXPORT_SYMBOL_GPL(kvm_apic_update_irr);

Right?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html