Re: [PATCH v4 2/2] KVM: VMX: Add Posted Interrupt supporting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Feb 23, 2013 at 02:05:28PM +0000, Zhang, Yang Z wrote:
> Marcelo Tosatti wrote on 2013-02-23:
> > 
> > On Fri, Feb 22, 2013 at 09:42:32PM +0800, Yang Zhang wrote:
> >> From: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >> 
> >> Posted Interrupt allows APIC interrupts to inject into guest directly
> >> without any vmexit.
> >> 
> >> - When delivering a interrupt to guest, if target vcpu is running,
> >>   update Posted-interrupt requests bitmap and send a notification event
> >>   to the vcpu. Then the vcpu will handle this interrupt automatically,
> >>   without any software involvemnt.
> >> - If target vcpu is not running or there already a notification event
> >>   pending in the vcpu, do nothing. The interrupt will be handled by
> >>   next vm entry
> >> Signed-off-by: Yang Zhang <yang.z.zhang@xxxxxxxxx>
> >> ---
> >> 
> >> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> >> index e4595f1..64616cc 100644
> >> --- a/arch/x86/kernel/irq.c
> >> +++ b/arch/x86/kernel/irq.c
> >> @@ -228,6 +228,26 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
> >>  	set_irq_regs(old_regs);
> >>  }
> >> +#ifdef CONFIG_HAVE_KVM
> >> +/*
> >> + * Handler for POSTED_INTERRUPT_VECTOR.
> >> + */
> >> +void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
> >> +{
> >> +	struct pt_regs *old_regs = set_irq_regs(regs);
> >> +
> >> +	ack_APIC_irq();
> >> +
> >> +	irq_enter();
> >> +
> >> +	exit_idle();
> >> +
> >> +	irq_exit();
> >> +
> >> +	set_irq_regs(old_regs);
> >> +}
> >> +#endif
> >> +
> > 
> > Add per-cpu counters, similarly to other handlers in the same file.
> Sure.
> 
> >> @@ -379,6 +398,7 @@ static inline int apic_find_highest_irr(struct kvm_lapic
> > *apic)
> >>  	if (!apic->irr_pending)
> >>  		return -1;
> >> +	kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
> >>  	result = apic_search_irr(apic);
> >>  	ASSERT(result == -1 || result >= 16);
> > 
> > kvm_x86_ops->sync_pir_to_irr() should be called from vcpu_enter_guest,
> > before inject_pending_event.
> > 
> >         if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> > 		->
> > 		inject_pending_event
> > 		...
> > 	}
> Some other places will search irr to do something, like kvm_cpu_has_interrupt(). So we must to sync pir to irr before touch irr, not just before enter guest.

I see. The only call site that needs IRR/PIR information with posted
interrupt enabled is kvm_arch_vcpu_runnable, correct?

If so, can we contain reading PIR to only that location?

And have PIR->IRR sync at KVM_REQ_EVENT processing only (to respect the
standard way of event processing and also reduce reading the PIR).

> >> +{
> >> +	__set_bit(vector, (unsigned long *)pi_desc->pir);
> >> +}
> > 
> > This must be locked atomic operation (set_bit).
> If using spinlock, pi_set_pir is not called concurrently. It safe to use _set_bit.

The manual demands atomic locked accesses (remember this a remote
access). See the posted interrupt page.

> > 
> > Don't see why the lock is necessary. Please use the following
> > pseudocode for vmx_deliver_posted_interrupt:
> > 
> > vmx_deliver_posted_interrupt(), returns 'bool injected'.
> > 
> > 1. orig_irr = read irr from vapic page
> > 2. if (orig_irr != 0)
> > 3.	return false;
> > 4. kvm_make_request(KVM_REQ_EVENT)
> > 5. bool injected = !test_and_set_bit(PIR)
> > 6. if (vcpu->guest_mode && injected)
> > 7. 	if (test_and_set_bit(PIR notification bit))
> > 8.		send PIR IPI
> > 9. return injected
> 
> Consider follow case:
> vcpu 0                      |                vcpu1
> send intr to vcpu1
> check irr
>                                             receive a posted intr
> 					      pir->irr(pir is cleared, irr is set)
> injected=test_and_set_bit=true
> pir=set
> 
> Then both irr and pir have the interrupt pending, they may merge to one later, but injected reported as true. Wrong.

True. Have a lock around it and at step 1 also verify if PIR is set. That
would do it, yes?

> > On vcpu_enter_guest:
> > 
> > if (kvm_check_request(KVM_REQ_EVENT))  {
> > 	pir->virr sync			(*)
> > 	...
> > }
> > ...
> > vcpu->mode = IN_GUEST_MODE;
> > local_irq_disable
> > clear pir notification bit unconditionally (*)
> Why clear ON bit here? If there is no pir->irr operation in this vmentry, clear on here is redundant. 

A PIR IPI must not be lost. Because if its lost, then interrupt
injection can be delayed while it must be performed immediately.

vcpu entry path has:
1. vcpu->mode = GUEST_MODE
2. local_irq_disable

injection path has:
1. if (vcpu->guest_mode && test_and_set_bit(PIR notif bit))
	send IPI

So it is possible that a PIR IPI is delivered and handled before guest
entry.

By clearing PIR notification bit after local_irq_disable, but before
the last check for vcpu->requests, we guarantee that a PIR IPI is never
lost.

Makes sense? (please check the logic, it might be flawed).

> > 
> > Please confirm whether a spinlock is necessary with the pseudocode above.
> In current implementation, spinlock is necessary to avoid race condition between vcpus when delivery posted interrupt and sync pir->irr.

Sync PIR->IRR uses xchg, which is locked and atomic, so can't see the
need for the spinlock there.

About serializing concurrent injectors, yes, you're right. OK then.

Please use the pseucode with a spinlock (the order of setting
KVM_REQ_EVENT etc must be there).

> >> +static void vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
> >> +{
> >> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> >> +	unsigned long flags;
> >> +
> >> +	if (!vmx_vm_has_apicv(vcpu->kvm))
> >> +		return;
> >> +
> >> +	spin_lock_irqsave(&vmx->pi_lock, flags);
> > 
> > 
> >> +	if (!pi_test_and_clear_on(&vmx->pi_desc)) {
> > 
> > This needs to move to vcpu_enter_guest, after local_irq_disabled,
> > unconditionally, as noted.
> > 
> >> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> >>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu, 				   
> >>  struct kvm_lapic_state *s) { +	kvm_x86_ops->sync_pir_to_irr(vcpu);
> >>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
> >>  
> >>  	return 0;
> >> --
> >> 1.7.1
> >
> 
> 
> Best regards,
> Yang
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux