On 18/12/2017 10:08, David Hildenbrand wrote:
> On 18.12.2017 09:50, Paolo Bonzini wrote:
>> On 18/12/2017 09:30, David Hildenbrand wrote:
>>> The ugly thing in kvm_irqfd_assign() is that we access irqfd without
>>> holding a lock. I think that should rather be fixed than working around
>>> that issue. (e.g. lock() -> lookup again -> verify still in list ->
>>> unlock())
>>
>> I wonder if it's even simpler:
>>
>> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
>> index f2ac53ab8243..17ed298bd66f 100644
>> --- a/virt/kvm/eventfd.c
>> +++ b/virt/kvm/eventfd.c
>> @@ -387,7 +387,6 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>>  
>>  	idx = srcu_read_lock(&kvm->irq_srcu);
>>  	irqfd_update(kvm, irqfd);
>> -	srcu_read_unlock(&kvm->irq_srcu, idx);
>>  
>>  	list_add_tail(&irqfd->list, &kvm->irqfds.items);
>>  
>> @@ -420,10 +419,12 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>>  				irqfd->consumer.token, ret);
>>  	}
>>  #endif
>> +	srcu_read_unlock(&kvm->irq_srcu, idx);
>>  
> 
> Was worried about the poll() call. But if that works, it would be very
> nice.

Good point.  The poll() call is effectively a callback to
irqfd_ptable_queue_proc.  So, after the above change, irqfd_wakeup
takes irq_srcu inside wqh->lock, while kvm_irqfd_assign would take
them in the opposite order.  However, these are read-side critical
sections, so the inversion does not cause a deadlock directly; the
only effect is that synchronize_srcu would now have to wait for
wqh->lock to be released.  (A schematic of the two nesting orders is
at the end of this message.)

The opposite order, which *would* cause a deadlock, would be a call
to synchronize_srcu while wqh->lock is held.  That cannot happen,
because wqh->lock is a spinlock, and synchronize_srcu, which sleeps,
cannot be called at all with a spinlock held.

So I think it's okay.

Thanks,

Paolo

> 
>>  	return 0;
>> 
>> fail:
>> +	/* irq_srcu is *not* held here. */
>>  	if (irqfd->resampler)
>>  		irqfd_resampler_shutdown(irqfd);
>> 
>> 
>> Thanks,
>> 
>> Paolo
>> 
> 
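
A schematic of the two nesting orders discussed above.  This is a
simplified sketch, not verbatim kernel code; the call chains shown
are the ones from virt/kvm/eventfd.c as described in this thread:

    /*
     * Waker side: irqfd_wakeup() runs as a wait-queue callback, so
     * wqh->lock is already held by the caller of wake_up():
     */
    spin_lock(&wqh->lock);                     /* in __wake_up() */
        idx = srcu_read_lock(&kvm->irq_srcu);  /* in irqfd_wakeup() */
        /* ... handle the interrupt ... */
        srcu_read_unlock(&kvm->irq_srcu, idx);
    spin_unlock(&wqh->lock);

    /*
     * Assign side: with the patch, the SRCU read-side section now
     * spans the poll() call, which takes wqh->lock when
     * irqfd_ptable_queue_proc() calls add_wait_queue():
     */
    idx = srcu_read_lock(&kvm->irq_srcu);      /* in kvm_irqfd_assign() */
        f.file->f_op->poll(f.file, &irqfd->pt);    /* -> irqfd_ptable_queue_proc() */
            spin_lock(&wqh->lock);             /* in add_wait_queue() */
            spin_unlock(&wqh->lock);
    srcu_read_unlock(&kvm->irq_srcu, idx);

srcu_read_lock() never blocks, so the inverted nesting cannot
deadlock; it only means synchronize_srcu() may have to wait until
wqh->lock is dropped and the read-side critical section ends.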