On Wed, Oct 27, 2010 at 11:17:42PM -0600, Alex Williamson wrote:
> On Thu, 2010-10-28 at 06:58 +0200, Michael S. Tsirkin wrote:
> > On Wed, Oct 27, 2010 at 04:58:20PM -0600, Alex Williamson wrote:
> > > On Wed, 2010-10-27 at 14:43 -0700, Etienne Martineau wrote:
> > > > On Wed, 27 Oct 2010, Alex Williamson wrote:
> > > > > No, emulated devices trigger interrupts directly with qemu_set_irq.
> > > > > irqfds are currently only used by vhost afaik, since it's being
> > > > > interrupted externally, much like pass through devices are.
> > > >
> > > > Fair enough. Thanks for the clarification.
> > > >
> > > > > Sort of. When the VFIO device triggers an interrupt, we get notified
> > > > > via the eventfd we've registered for that interrupt. We can then call
> > > > > qemu_set_irq directly to raise that interrupt in the KVM kernel APIC.
> > > > > That much works today.
> > > >
> > > > Understood, but performance-wise this is no good for KVM, right?
> > >
> > > Right, bouncing interrupts and EOIs through qemu via eventfds is going
> > > to add latency. On the interrupt path we already have irqfds, which
> > > will avoid the bounce through userspace, we just need to use them.
> > > Doing something similar with EOIs could avoid that path, giving us
> > > something comparable to current device assignment.
> > >
> > > > > The irqfd mechanism is simply a way for KVM to
> > > > > directly consume the eventfd and raise an interrupt via a pre-setup
> > > > > vector. That's yet to be implemented for INTx on VFIO, but should
> > > > > mostly be a matter of connecting existing pieces together. It's working
> > > > > for MSI-X.
> > > >
> > > > OK, I was under the impression you already had irqfd 'connected' to KVM from
> > > > VFIO... This is why I was asking about the nature of the changes in VFIO.
> > > >
> > > > > When VFIO sends an interrupt, it disables the physical device from
> > > > > generating more interrupts (this is where VFIO requires PCI 2.3
> > > > > compliant devices for the INTx disable bit in the status register).
> > > > > When the guest services the interrupt, we can detect this by catching
> > > > > the EOI of the IOAPIC. At that point, we can re-enable interrupts on
> > > > > the device. Wash, rinse, repeat.
> > > > >
> > > > > To do this in qemu, I created a callback on the ioapic where drivers can
> > > > > register for the interrupt they care about. Since KVM moves the ioapic
> > > > > into the kernel, we need to extend this into KVM and have yet another
> > > > > eventfd mechanism. It's possible that we could have the VFIO kernel
> > > > > module also receive this eventfd, re-enabling interrupts on the device,
> > > > > in much the same way as above.
> > > >
> > > > In the case of KVM, where are you going to catch the EOI? For some
> > > > reason I'm under the impression that this is part of KVM. If so then how are
> > > > you going to 'signal' to VFIO? Cannot use eventfd here right?
> > >
> > > KVM already has an internal IRQ ACK notifier (which is what current
> > > device assignment uses to do the same thing), it's just a matter of
> > > adding a callback that does a kvm_register_irq_ack_notifier that sends
> > > off the eventfd signal. I've got this working and will probably send
> > > out the KVM patch this week. For now the eventfd goes to userspace, but
> > > this is where I imagine we could steal some of the irqfd code to make
> > > VFIO consume the irqfd signal directly. Thanks,
> > >
> > > Alex
> >
> > BTW, how do we handle sharing the interrupt in guest?
>
> I'm currently using flags to track whether we've asserted the interrupt
> in qemu, and only act on the eoi when the flag is set. In my current
> setup, the guest puts the pass through device and USB on the same
> interrupt and using this filtering seems to be sufficient.
> I think this should act just like bare metal, the device will reassert the
> interrupt if it still needs service, but we can avoid obviously gratuitous
> eois being passed down to vfio.
>
> This will complicate having vfio intercept the eoi eventfd directly
> since it will then need to track the state too. Another thing I've got
> working is letting vfio support older non-PCI-2.3 compliant devices so
> long as they can claim an exclusive interrupt (just like current code).
> We need to track whether the irq is enabled or disabled for this anyway
> so that we don't get unbalanced enable/disables.
>
> Alex

Tracking state is also good for saving an extra config read on each access.

--
MST
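
The flag-based tracking discussed in this thread can be sketched roughly as
below. This is only an illustration under stated assumptions, not code from
VFIO, KVM, or qemu: the struct and the intx_* function names are hypothetical,
and the counters stand in for the real device mask/unmask operations. It shows
two things from the discussion: an EOI is acted on only if we asserted the
interrupt ourselves (so EOIs caused by another device on a shared line are
filtered out), and mask/unmask calls stay balanced because the current state
is remembered.

```c
#include <stdbool.h>

/* Hypothetical per-device INTx state (names are illustrative only). */
struct intx_state {
    bool asserted;     /* we raised the guest interrupt and await its EOI */
    bool masked;       /* device interrupt is currently disabled */
    int  mask_calls;   /* stand-in for real "disable INTx" operations */
    int  unmask_calls; /* stand-in for real "re-enable INTx" operations */
};

/* Disable the device interrupt, but only if it is not already disabled. */
void intx_mask(struct intx_state *s)
{
    if (!s->masked) {
        s->masked = true;
        s->mask_calls++;
    }
}

/* Re-enable the device interrupt, but only if it is currently disabled. */
void intx_unmask(struct intx_state *s)
{
    if (s->masked) {
        s->masked = false;
        s->unmask_calls++;
    }
}

/* Called when the device's interrupt eventfd fires. */
void intx_on_interrupt(struct intx_state *s)
{
    intx_mask(s);        /* stop the device from interrupting again */
    s->asserted = true;  /* this is where qemu would raise the guest irq */
}

/*
 * Called on an EOI for the guest interrupt line.
 * Returns true if the EOI was for an interrupt we asserted, false if it
 * was gratuitous (for example, caused by another device sharing the line).
 */
bool intx_on_eoi(struct intx_state *s)
{
    if (!s->asserted)
        return false;    /* not ours: do not touch the device */

    s->asserted = false; /* this is where qemu would lower the guest irq */
    intx_unmask(s);      /* the device may reassert if it still needs service */
    return true;
}
```

A caller would invoke intx_on_interrupt() from the interrupt eventfd handler
and intx_on_eoi() from the EOI notification path. If this logic moved into the
kernel, as suggested above, the same state would have to be tracked there.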