On Wed, Apr 21, 2010 at 03:48:12PM +0800, Yang, Sheng wrote:
> On Tuesday 20 April 2010 23:54:01 Marcelo Tosatti wrote:
> > The assigned device interrupt work handler calls kvm_set_irq, which
> > can sleep, for example while waiting for the ioapic mutex, from an
> > irq-disabled section.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=15725
> >
> > Fix by dropping assigned_dev_lock (and re-enabling interrupts)
> > before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
> > cases do not require the lock or interrupts disabled (a new work
> > instance will be queued in case of a concurrent interrupt).
>
> Looks fine, but depending on the new work being queued sounds a
> little tricky...

I think that's guaranteed behaviour, so you can schedule_work() from
within a worker.

> How about a local_irq_disable() at the beginning? It can ensure no
> concurrent interrupts would happen as well, I think.

> >
> > KVM-Stable-Tag.
> > Signed-off-by: Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> >
> > diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> > index 47ca447..7ac7bbe 100644
> > --- a/virt/kvm/assigned-dev.c
> > +++ b/virt/kvm/assigned-dev.c
> > @@ -64,24 +64,33 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
> >  				    interrupt_work);
> >  	kvm = assigned_dev->kvm;
> >
> > -	spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  	if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSIX) {
> >  		struct kvm_guest_msix_entry *guest_entries =
> >  			assigned_dev->guest_msix_entries;
>
> irq_requested_type and guest_msix_entries should also be protected by
> the lock. So how about another spin_lock()/unlock() pair wrapping the
> second kvm_set_irq()?

I don't think that's necessary, because irq_requested_type and
guest_msix_entries never change once set up. They only change via
deassign_irq, which first disables the IRQ and flushes pending work.
> > +
> > +		spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  		for (i = 0; i < assigned_dev->entries_nr; i++) {
> >  			if (!(guest_entries[i].flags &
> >  			      KVM_ASSIGNED_MSIX_PENDING))
> >  				continue;
> >  			guest_entries[i].flags &= ~KVM_ASSIGNED_MSIX_PENDING;
> > +			/*
> > +			 * If kvm_assigned_dev_intr sets pending for an
> > +			 * entry smaller than this work instance is
> > +			 * currently processing, a new work instance
> > +			 * will be queued.
> > +			 */
> > +			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> >  			kvm_set_irq(assigned_dev->kvm,
> >  				    assigned_dev->irq_source_id,
> >  				    guest_entries[i].vector, 1);
> > +			spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  		}
> > +		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> >  	} else
> >  		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
> >  			    assigned_dev->guest_irq, 1);
>
> Or could we make kvm_set_irq() atomic? Though the code path is a
> little long for a spinlock.

Yes, given the sleep-inside-RCU-protected-section bug from
kvm_notify_acked_irq, either that or convert the IRQ locking to SRCU.
But as you said, the code paths are long and potentially slow, so
SRCU is probably the better alternative.

Gleb?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html