On Sun, 2010-07-11 at 22:23 +0300, Michael S. Tsirkin wrote:
> On Sun, Jul 11, 2010 at 01:21:18PM -0600, Alex Williamson wrote:
> > On Sun, 2010-07-11 at 21:54 +0300, Michael S. Tsirkin wrote:
> > > On Sun, Jul 11, 2010 at 09:30:59PM +0300, Avi Kivity wrote:
> > > > On 07/11/2010 09:26 PM, Alex Williamson wrote:
> > > > >On Sun, 2010-07-11 at 21:14 +0300, Avi Kivity wrote:
> > > > >>On 07/11/2010 09:09 PM, Alex Williamson wrote:
> > > > >>>For device assignment, we need to know when the VM writes an end
> > > > >>>of interrupt to the APIC, which allows us to de-assert the interrupt
> > > > >>>line and clear the DisINTx bit.  Add a new wrapper for ioapic
> > > > >>>generated interrupts with a callback on eoi and create an interface
> > > > >>>for drivers to be notified on eoi.
> > > > >>>
> > > > >>You aren't going to get this with kvm's in-kernel irqchip, so we need a
> > > > >>new interface there.
> > > > >Registering an eventfd for the eoi seems like a reasonable alternative.
> > > >
> > > > I'm worried about that racing (with what?)
> > >
> > > With device asserting the interrupt?
> > > Need to make sure that all possible scenarios work well:
> > >
> > > device asserts interrupt
> > > driver clears interrupt
> > > device asserts interrupt
> > > eoi
> > >
> > > device asserts interrupt
> > > driver clears interrupt
> > > eoi
> > > device asserts interrupt
> > >
> > > etc
> > >
> > > Not that I see issues, these are things we need to check.
> >
> > I think those are all protected by host and qemu vfio drivers managing
> > DisINTx.  The way I understand it to work now is:
> >
> > device asserts interrupt
> > interrupt lands in host vfio driver
> > host vfio sets DisINTx on the device
> > host vfio sends eventfd
> > eventfd lands in qemu vfio, does a qemu_set_irq
> > ...
> > guest processes
> > guest writes eoi to apic, lands back in qemu vfio driver
> > qemu vfio deasserts qemu interrupt
> > qemu vfio clears DisINTx
> >
> > So I don't think there's a race as long as ordering is sane for toggling
> > DisINTx.  Thanks,
> >
> > Alex
> >
>
> What about threaded interrupts?  I think (correct me if I am wrong)
> that they work like this:
>
> device asserts interrupt
> guest disables interrupt

Is this the guest manipulating DisINTx itself?  I suppose it could be a
device dependent disable as well.

> eoi
> guest enables interrupt
> driver clears interrupt

These two are hopefully reversed or else the driver is expecting to
clear and potentially reassert interrupts anyway.

> device asserts interrupt
>
> If so, your code will clear DisINTx immediately which
> will always get us another host interrupt:
> performance will be hurt. I am also not sure
> we'll not lose interrupts.

Level interrupts are lossy afaik, if it gets cleared but an interrupt
condition still exists, it should be reasserted.

> It seems we need to track interrupt disable/enable as well, and only
> clear DisINTx after eoi with interrupts enabled. Not sure what is the
> interface for this.

If a driver uses device dependent code to disable interrupts, there's no
issue, we'll clear DisINTx, but the device still won't generate an
interrupt until the dependent code is re-enabled by the guest (assuming
there's no cross talk between DisINTx and device dependent components).
For the case that a guest driver disables via DisINTx, it seems easy to
trap and track that.
So we get:

device asserts interrupt
guest disables interrupt (trapped, qemu-vfio sets intx.guest_disabled = 1)
eoi (qemu-vfio deasserts qemu interrupts, but because of above doesn't
     clear DisINTx)
guest enables interrupt (allowed to pass through, intx.guest_disabled = 0)
driver clears interrupt
device asserts interrupt

I've already got an intx.pending bit, so I think this just changes the
eoi to:

    vdev->intx.pending = 0;
    qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
    if (!vdev->intx.guest_disabled) {
        vfio_unmask_intx(vdev);
    }

Writing the command register DisINTx bit then just gets some kind of:

    if ((cmd & PCI_COMMAND_INTX_DISABLE) && intx.pending) {
        intx.guest_disabled = 1;
        cmd &= ~PCI_COMMAND_INTX_DISABLE;
    } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && intx.guest_disabled) {
        intx.guest_disabled = 0;
    }
    ... allow write

That work?  Thanks,

Alex