On Mon, 2012-06-18 at 11:02 +0300, Avi Kivity wrote: > On 06/16/2012 07:34 PM, Alex Williamson wrote: > > I'm looking for opinions on this approach. For vfio device assignment > > we minimally need a way to get EOIs from the in-kernel irqchip out to > > userspace. Getting that out via an eventfd would allow us to bounce > > all level interrupts out to userspace, where we would de-assert the > > device interrupt in qemu and unmask the physical device. Ideally we > > could deassert the interrupt in KVM, which allows us to send the EOI > > directly to vfio. To do that, we need to use a new IRQ source ID so > > the guest sees the logical OR of qemu requested state and external > > device state. This allows external devices to share interrupts with > > emulated devices, just like KVM assignment. That means the means we > > also need to use a new source ID when injecting the interrupt via > > irqfd. > > > > Rather than creating a source ID allocation interface, extending irqfd > > to support a user supplied source ID, and creating another new > > interface to get the EOI out, I think it works out better to bundle > > these all together as just a level irqfd interface. This way we don't > > allow users to create unbalanced states where a level interrupt is > > asserted, but has no way of being deasserted. I believe the below > > does this, but needs testing and validation with an implementation in > > qemu. > > > > Missing documentation, which helps at least one reviewer review. It's > not just for a commit. Yeah, sorry. There's no irqfd documentation and I didn't a chance to write it so I could add my blurb to it. > > +static void > > +irqfd_inject_level(struct work_struct *work) > > +{ > > + struct _irqfd *irqfd = container_of(work, struct _irqfd, inject); > > + > > + kvm_set_irq(irqfd->kvm, irqfd->irq_source_id, irqfd->gsi, 1); > > +} > > + > > +static void > > +irqfd_ack_level(struct kvm_irq_ack_notifier *notifier) > > +{ > > + struct _irqfd *irqfd = container_of(notifier, struct _irqfd, notifier); > > + > > + kvm_set_irq(irqfd->kvm, irqfd->irq_source_id, irqfd->gsi, 0); > > + eventfd_signal(irqfd->eoi_eventfd, 1); > > +} > > + > > I don't understand how this works. A level IRQ isn't de-asserted by the > EOI, it's de-asserted by its source. > > Consider the following sequence: > > device guest > > event > assert > interrupt > interrupt handler > handle event > clear ISR bit > deassert > event > assert > EOI > > What should happen is that the interrupt will be redelivered > immmediately after the EOI, but that won't happen with your API since > the EOI ack notifier will deassert the interrupt and nothing will > re-assert it. A device that can de-assert it's own interrupt based on register/config access probably isn't going to subscribe to this interface. Such a device already has much greater visibility of it's service needs. In the case where we have external devices, they'll reassert the interrupt, which is part of this api that will need to be documented. Comparing this to real hardware, an eoi causes the ioapic to reassert the interrupt if any of it's inputs are still active. Here we preemptively de-assert our interrupt and require the user of the interface to re-assert. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html