On Wed, Oct 28, 2015 at 01:44:55AM +0100, Paolo Bonzini wrote:
>
>
> On 27/10/2015 22:26, Yunhong Jiang wrote:
> >> > On RT kernels however can you call eventfd_signal from interrupt
> >> > context?  You cannot call spin_lock_irqsave (which can sleep) from a
> >> > non-threaded interrupt handler, can you?  You would need a raw spin lock.
> > Thanks for pointing this out. Yes, we can't call spin_lock_irqsave on RT
> > kernel. Will do this way on next patch. But not sure if it's overkill to use
> > raw_spinlock there since the eventfd_signal is used by other caller also.
>
> No, I don't think you can use raw_spinlock there.  The problem is not
> just eventfd_signal, it is especially wake_up_locked_poll.  You cannot
> convert the whole workqueue infrastructure to use raw_spinlock.

You mean the waitqueue rather than the workqueue, right? One option is to
change eventfd to use a simple wait queue, which is protected by a
raw_spinlock. But using a simple waitqueue in eventfd may in fact hurt
real-time latency in scenarios other than this one.

>
> Alex, would it make sense to use the IRQ bypass infrastructure always,
> not just for VT-d, to do the MSI injection directly from the VFIO
> interrupt handler and bypass the eventfd?  Basically this would add an
> RCU-protected list of consumers matching the token to struct
> irq_bypass_producer, and a
>
> 	int (*inject)(struct irq_bypass_consumer *);
>
> callback to struct irq_bypass_consumer.  If any callback returns true,
> the eventfd is not signaled.  The KVM implementation would be like this
> (compare with virt/kvm/eventfd.c):
>
> 	/* Extracted out of irqfd_wakeup */
> 	static int
> 	irqfd_wakeup_pollin(struct kvm_kernel_irqfd *irqfd)
> 	{
> 		...
> 	}
>
> 	/* Extracted out of irqfd_wakeup */
> 	static int
> 	irqfd_wakeup_pollhup(struct kvm_kernel_irqfd *irqfd)
> 	{
> 		...
> 	}
>
> 	static int
> 	irqfd_wakeup(wait_queue_t *wait, unsigned mode, int sync,
> 		     void *key)
> 	{
> 		struct kvm_kernel_irqfd *irqfd = container_of(wait,
> 			struct kvm_kernel_irqfd, wait);
> 		unsigned long flags = (unsigned long)key;
>
> 		if (flags & POLLIN)
> 			irqfd_wakeup_pollin(irqfd);
> 		if (flags & POLLHUP)
> 			irqfd_wakeup_pollhup(irqfd);
>
> 		return 0;
> 	}
>
> 	static int kvm_arch_irq_bypass_inject(
> 		struct irq_bypass_consumer *cons)
> 	{
> 		struct kvm_kernel_irqfd *irqfd =
> 			container_of(cons, struct kvm_kernel_irqfd,
> 				     consumer);
>
> 		return irqfd_wakeup_pollin(irqfd);
> 	}
>

This is a good idea IMHO. So for MSI interrupts, kvm_arch_irq_bypass_inject
will be used and irqfd_wakeup will no longer be invoked, am I right?

I noticed the irq bypass manager is not merged yet. Is there a git branch
for it?

> Or do you think it would be a hack?  The latency improvement might
> actually be even better than what Yunhong is already reporting.

I will be glad to try it.

Thanks
--jyh

>
> Paolo
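
The dispatch rule discussed in this thread (call the inject callback of
every consumer registered for the producer's token, and signal the eventfd
only if none of them reports delivery) can be modelled in plain userspace C.
This is only a sketch under assumptions: the struct and function names
below are hypothetical stand-ins, not the kernel's irq_bypass_producer or
irq_bypass_consumer definitions; a simple linked list replaces the
RCU-protected list; and a counter replaces the call to eventfd_signal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's irq bypass structures. */
struct bypass_consumer {
	const void *token;                       /* must match the producer's token */
	int (*inject)(struct bypass_consumer *); /* nonzero: interrupt was delivered */
	struct bypass_consumer *next;
	int injected;                            /* bookkeeping for this sketch only */
};

struct bypass_producer {
	const void *token;
	struct bypass_consumer *consumers; /* RCU-protected list in the proposal */
	int eventfd_signals;               /* stands in for eventfd_signal() */
};

/* Register a consumer; only consumers matching the token are accepted. */
static int producer_add_consumer(struct bypass_producer *prod,
				 struct bypass_consumer *cons)
{
	if (cons->token != prod->token)
		return -1;
	cons->next = prod->consumers;
	prod->consumers = cons;
	return 0;
}

/*
 * Models the producer's interrupt handler. Every consumer gets a chance
 * to inject; the eventfd fallback runs only if no callback returned true.
 * Returns true when the eventfd fallback was used.
 */
static bool producer_handle_irq(struct bypass_producer *prod)
{
	struct bypass_consumer *c;
	bool delivered = false;

	for (c = prod->consumers; c; c = c->next) {
		if (c->inject && c->inject(c))
			delivered = true;
	}

	if (!delivered)
		prod->eventfd_signals++;

	return !delivered;
}

/* Example callback: always claims the interrupt, like a direct injection. */
static int demo_inject(struct bypass_consumer *cons)
{
	cons->injected++;
	return 1;
}
```

With no consumer registered, producer_handle_irq() falls back to the
eventfd counter; once a consumer using demo_inject is added, the counter
stays put and only the consumer's injected count grows. Locking and
interrupt-context constraints, which are the actual subject of the thread,
are deliberately left out of this model.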