On Sun, Jul 11, 2010 at 02:03:34PM -0600, Alex Williamson wrote:
> On Sun, 2010-07-11 at 22:23 +0300, Michael S. Tsirkin wrote:
> > On Sun, Jul 11, 2010 at 01:21:18PM -0600, Alex Williamson wrote:
> > > On Sun, 2010-07-11 at 21:54 +0300, Michael S. Tsirkin wrote:
> > > > On Sun, Jul 11, 2010 at 09:30:59PM +0300, Avi Kivity wrote:
> > > > > On 07/11/2010 09:26 PM, Alex Williamson wrote:
> > > > > >On Sun, 2010-07-11 at 21:14 +0300, Avi Kivity wrote:
> > > > > >>On 07/11/2010 09:09 PM, Alex Williamson wrote:
> > > > > >>>For device assignment, we need to know when the VM writes an end
> > > > > >>>of interrupt to the APIC, which allows us to de-assert the interrupt
> > > > > >>>line and clear the DisINTx bit.  Add a new wrapper for ioapic
> > > > > >>>generated interrupts with a callback on eoi and create an interface
> > > > > >>>for drivers to be notified on eoi.
> > > > > >>>
> > > > > >>You aren't going to get this with kvm's in-kernel irqchip, so we need a
> > > > > >>new interface there.
> > > > > >Registering an eventfd for the eoi seems like a reasonable alternative.
> > > > >
> > > > > I'm worried about that racing (with what?)
> > > >
> > > > With device asserting the interrupt?
> > > > Need to make sure that all possible scenarios work well:
> > > >
> > > >   device asserts interrupt
> > > >   driver clears interrupt
> > > >   device asserts interrupt
> > > >   eoi
> > > >
> > > >   device asserts interrupt
> > > >   driver clears interrupt
> > > >   eoi
> > > >   device asserts interrupt
> > > >
> > > >   etc
> > > >
> > > > Not that I see issues, these are things we need to check.
> > >
> > > I think those are all protected by host and qemu vfio drivers managing
> > > DisINTx.  The way I understand it to work now is:
> > >
> > >   device asserts interrupt
> > >   interrupt lands in host vfio driver
> > >   host vfio sets DisINTx on the device
> > >   host vfio sends eventfd
> > >   eventfd lands in qemu vfio, does a qemu_set_irq
> > >   ...
> > >   guest processes
> > >   guest writes eoi to apic, lands back in qemu vfio driver
> > >   qemu vfio deasserts qemu interrupt
> > >   qemu vfio clears DisINTx
> > >
> > > So I don't think there's a race as long as ordering is sane for toggling
> > > DisINTx.  Thanks,
> > >
> > > Alex
> > >
> >
> > What about threaded interrupts? I think (correct me if I am wrong)
> > that they work like this:
> >
> >   device asserts interrupt
> >   guest disables interrupt
>
> Is this the guest manipulating DisINTx itself?  I suppose it could be a
> device dependent disable as well.

It can manipulate it, so we need to virtualize it,
but that's a separate issue.

> >   eoi
> >   guest enables interrupt
> >   driver clears interrupt
>
> These two are hopefully reversed or else the driver is expecting to
> clear and potentially reassert interrupts anyway.

Yes. Sorry.

> >   device asserts interrupt
> >
> > If so, your code will clear DisINTx immediately which
> > will always get us another host interrupt:
> > performance will be hurt. I am also not sure
> > we'll not lose interrupts.
>
> Level interrupts are lossy afaik, if it gets cleared but an interrupt
> condition still exists, it should be reasserted.

Yes, but I mean we won't interrupt the guest, so it will stay
disabled forever.

> > It seems we need to track interrupt disable/enable as well, and only
> > clear DisINTx after eoi with interrupts enabled. Not sure what is the
> > interface for this.
>
> If a driver uses device dependent code to disable interrupts, there's no
> issue, we'll clear DisINTx, but the device still won't generate an
> interrupt until the dependent code is re-enabled by the guest (assuming
> there's no cross talk between DisINTx and device dependent components).
>
> For the case that a guest driver disables via DisINTx, it seems easy to
> trap and track that.
> So we get:
>
>   device asserts interrupt
>   guest disables interrupt
>     (trapped, qemu-vfio sets intx.guest_disabled = 1)
>   eoi
>     (qemu-vfio deasserts qemu interrupts, but because of above doesn't clear DisINTx)
>   guest enables interrupt
>     (allowed to pass through, intx.guest_disabled = 0)
>   driver clears interrupt
>   device asserts interrupt
>
> I've already got an intx.pending bit, so I think this just changes the eoi to:
>
>   vdev->intx.pending = 0;
>   qemu_set_irq(vdev->pdev.irq[vdev->intx.pin], 0);
>   if (!vdev->intx.guest_disabled) {
>       vfio_unmask_intx(vdev);
>   }
>
> Writing the command register DisINTx bit then just gets some kind of:
>
>   if (cmd & PCI_COMMAND_INTX_DISABLE && intx.pending) {
>       intx.guest_disabled = 1;
>       cmd &= ~PCI_COMMAND_INTX_DISABLE;
>   } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && intx.guest_disabled) {
>       intx.guest_disabled = 0;
>   }
>   ... allow write
>
> That work?  Thanks,
>
> Alex

No, I mean guest OS disables the specific interrupt with disable_irq.

--
MST
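
For reference, the INTx bookkeeping proposed in the quoted text (a pending
bit plus a guest_disabled bit, with the unmask at eoi made conditional) can
be modelled as a small standalone C fragment. This is only a sketch under
stated assumptions, not the qemu-vfio code: struct intx_state, the
host_masked and irq_level fields, and the intx_* function names are
hypothetical stand-ins. host_masked models DisINTx being set on the physical
device, irq_level models the level passed to qemu_set_irq(), and
PCI_COMMAND_INTX_DISABLE is defined locally as bit 10 of the PCI command
register. It does not attempt to cover the disable_irq case raised in the
reply, which does not go through DisINTx at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of the PCI command register (Interrupt Disable). */
#define PCI_COMMAND_INTX_DISABLE 0x400

/* Hypothetical model of per-device INTx state; not the real qemu-vfio struct. */
struct intx_state {
    bool pending;        /* interrupt injected into the guest, eoi not yet seen */
    bool guest_disabled; /* guest set DisINTx while an interrupt was pending */
    bool host_masked;    /* assumption: models DisINTx set on the physical device */
    int irq_level;       /* assumption: models the level given to qemu_set_irq() */
};

/* Host vfio has masked the device and signalled the eventfd. */
static void intx_interrupt(struct intx_state *s)
{
    s->host_masked = true;
    s->pending = true;
    s->irq_level = 1;
}

/* Guest wrote eoi: deassert, and unmask only if the guest has not disabled INTx. */
static void intx_eoi(struct intx_state *s)
{
    s->pending = false;
    s->irq_level = 0;
    if (!s->guest_disabled) {
        s->host_masked = false; /* stands in for vfio_unmask_intx() */
    }
}

/* Guest wrote the command register; returns the value that is allowed through. */
static uint16_t intx_command_write(struct intx_state *s, uint16_t cmd)
{
    if ((cmd & PCI_COMMAND_INTX_DISABLE) && s->pending) {
        s->guest_disabled = true;
        cmd &= (uint16_t)~PCI_COMMAND_INTX_DISABLE;
    } else if (!(cmd & PCI_COMMAND_INTX_DISABLE) && s->guest_disabled) {
        s->guest_disabled = false;
        s->host_masked = false; /* assumption: the enable reaches the device */
    }
    return cmd;
}

/* Walks the ordering from the thread: assert, guest disable, eoi, guest enable. */
void intx_selfcheck(void)
{
    struct intx_state s = {0};

    intx_interrupt(&s);
    (void)intx_command_write(&s, PCI_COMMAND_INTX_DISABLE);
    intx_eoi(&s);
    assert(s.host_masked);  /* eoi alone must not unmask the device */

    (void)intx_command_write(&s, 0);
    assert(!s.host_masked); /* unmasked once the guest re-enables */
}
```

In this model the device stays masked between the eoi and the guest's
re-enable, which is the property the guest_disabled bit is meant to provide.
When the guest never touches DisINTx, intx_eoi() unmasks immediately, as in
the original flow.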