On Thu, Apr 22, 2010 at 04:40:30PM -0300, Marcelo Tosatti wrote: > On Thu, Apr 22, 2010 at 09:11:30PM +0300, Gleb Natapov wrote: > > On Thu, Apr 22, 2010 at 01:40:38PM -0300, Marcelo Tosatti wrote: > > > On Wed, Apr 21, 2010 at 09:38:39PM +0300, Gleb Natapov wrote: > > > > On Wed, Apr 21, 2010 at 03:29:11PM -0300, Marcelo Tosatti wrote: > > > > > On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote: > > > > > > On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote: > > > > > > > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote: > > > > > > > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote: > > > > > > > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long > > > > > > > > > > for spinlock. > > > > > > > > > > > > > > > > > > Yes, given the sleep-inside-RCU-protected section bug from > > > > > > > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU. > > > > > > > > > > > > > > > > > > But as you said, the code paths are long and potentially slow, so > > > > > > > > > probably SRCU is a better alternative. > > > > > > > > > > > > > > > > > > Gleb? > > > > > > > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt > > > > > > > > injection scalable. > > > > > > > > > > > > > > We meant ioapic lock. See the last report from Ralf on this thread. > > > > > > Can we solve the problem by calling ack notifier outside rcu read > > > > > > section in kvm_notify_acked_irq()? > > > > > > > > > > The unregister path does > > > > > > > > > > - remove_from_list(entry) > > > > > - synchronize_rcu > > > > > - kfree(entry) > > > > > > > > > > So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the > > > > > notifier entry can be freed. > > > > What I mean is kvm_notify_acked_irq() will iterate over all ack entries > > > > in rcu read protected section, but instead of calling callback it will > > > > collect them into the array and call them outside rcu read section. At > > > > this point it doesn't matter if entry is unregistered since the copy is > > > > used to actually call the notifier. > > > > > > Here it is, but no, this trick can't be done safely for ack notifiers > > > because they are unregistered at runtime by device assignment. > > > > > > How does the freeing path knows it can proceed to free its entry if > > > there is no reliable way to know if there is a reference to itself? > > > (think "priv" gets freed between rcu_read_unlock and ->irq_acked with > > > the patch below). > > > > > Yeah, I see :( > > > > > I can convert to SRCU if you have no objections. > > > > > AFAIR there was a disadvantage comparing to RCU, but I don't remember what > > it was exactly. > > http://www.mentby.com/paul-e-mckenney/kvm-patch-v3-13-kvm-fix-race-in-irqrouting-logic.html > > But for KVM IRQ path's it should not matter much since usage of grace > periods is rare (registration/unregistration is very rare compared to > read side), and the IRQ path's are already large and slow, so the added > overhead should not be noticeable. > msix path shouldn't be slow and this is the case we are trying to optimize. SRCU's read-side primitives are also significantly slower than those of RCU. Sounds not too optimistic. > > What about converting PIC/IOAPIC mutexes into spinlocks? > > Works for me, but on large guests the spinning will be noticeable. > I believe. For interrupts going through IOPIC, but we know this is not scalable anyway. > > Avi? -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html