On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > > irqfds are currently only used by vhost afaik, since it's being
> > > > interrupted externally, much like pass through devices are.
> > >
> > > Fair enough. Thanks for the clarification.
> > >
> > > > Sort of. When the VFIO device triggers an interrupt, we get notified
> > > > via the eventfd we've registered for that interrupt. We can then call
> > > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > > That much works today.
> > >
> > > Understood but performance wise this is no good for KVM right?
> >
> > Right, bouncing interrupts and EOIs through qemu via eventfds is going
> > to add latency. On the interrupt path we already have irqfds, which
> > will avoid the bounce through userspace, we just need to use them.
> > Doing something similar with EOIs could avoid that path, giving us
> > something comparable to current device assignment.
> >
> > > > The irqfd mechanism is simply a way for KVM to
> > > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > > vector. That's yet to be implemented for INTx on VFIO, but should
> > > > mostly be a matter of connecting existing pieces together. It's working
> > > > for MSI-X.
> > >
> > > OK, I was under the impression you already had irqfd 'connected' to KVM from
> > > VFIO... This is why I was asking about the nature of the changes in VFIO.
> > >
> > > > When VFIO sends an interrupt, it disables the physical device from
> > > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > > compliant devices for the INTx disable bit in the status register).
> > > > When the guest services the interrupt, we can detect this by catching
> > > > the EOI of the IOAPIC. At that point, we can re-enable interrupts on
> > > > the device. Wash, rinse, repeat.
> > > >
> > > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > > register for the interrupt they care about. Since KVM moves the ioapic
> > > > into the kernel, we need to extend this into KVM and have yet another
> > > > eventfd mechanism. It's possible that we could have the VFIO kernel
> > > > module also receive this eventfd, re-enabling interrupts on the device,
> > > > in much the same way as above.
> > >
> > > In the case of KVM where are you going to catch the EOI? For some
> > > reason I'm under the impression that this is part of KVM. If so then how are
> > > you going to 'signal' to VFIO? Cannot use eventfd here right?
> >
> > KVM already has an internal IRQ ACK notifier (which is what current
> > device assignment uses to do the same thing), it's just a matter of
> > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > off the eventfd signal. I've got this working and will probably send
> > out the KVM patch this week. For now the eventfd goes to userspace, but
> > this is where I imagine we could steal some of the irqfd code to make
> > VFIO consume the irqfd signal directly. Thanks,
> >
> > Alex
>
> BTW, how do we handle sharing the interrupt in guest?

I'm currently using flags to track whether we've asserted the interrupt
in qemu, and only act on the eoi when the flag is set. In my current
setup, the guest puts the pass through device and USB on the same
interrupt, and this filtering seems to be sufficient. I think this
should act just like bare metal: the device will reassert the interrupt
if it still needs service, but we can avoid obviously gratuitous eois
being passed down to vfio. This will complicate having vfio intercept
the eoi eventfd directly, since it will then need to track the state too.
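
To make the filtering concrete, here is a minimal sketch of the flag
logic as a toy Python model. It is not the qemu or vfio code: the class,
method, and attribute names are all hypothetical, and the real path goes
through eventfds and the ioapic callback rather than direct calls.

```python
class IntxModel:
    """Toy model of the INTx mask / EOI-filter cycle (all names hypothetical)."""

    def __init__(self):
        self.asserted = False       # have we raised the interrupt in the guest?
        self.device_masked = False  # is INTx disabled on the physical device?
        self.unmasks = 0            # number of EOIs actually passed down

    def device_interrupt(self):
        # The device fired: assume vfio has disabled INTx on it, and we
        # raise the line in the guest (stands in for qemu_set_irq).
        self.device_masked = True
        self.asserted = True

    def guest_eoi(self):
        # Called for every EOI on the (possibly shared) guest interrupt.
        if not self.asserted:
            return False            # gratuitous EOI, e.g. from USB: ignore it
        self.asserted = False
        self.device_masked = False  # re-enable interrupts on the device
        self.unmasks += 1
        return True


model = IntxModel()
model.guest_eoi()         # EOI for the other device on the line: filtered
model.device_interrupt()  # pass through device fires and gets masked
model.guest_eoi()         # this EOI is ours: device is re-enabled
model.guest_eoi()         # flag already cleared: filtered again
```

If the device still needs service after the re-enable, it simply
interrupts again and the cycle repeats, which is the bare metal
behaviour described above.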

Another thing I've got working is letting vfio support older
non-PCI-2.3 compliant devices so long as they can claim an exclusive
interrupt (just like current code). We need to track whether the irq is
enabled or disabled for this anyway so that we don't get unbalanced
enables/disables.

Alex