Here's the much anticipated re-write of support for level irqfds. As Michael suggested, I've rolled the eoi/ack notification fd into KVM_IRQFD as a new mode. For lack of a better name, as there seem to be objections to associating this specifically with an EOI or an ACK, I've named this OADN or "On Ack, De-assert & Notify".

Patch 1/2 switches current KVM_IRQFDs to use their own IRQ source ID, since we're potentially stepping on KVM_USERSPACE_IRQ_SOURCE_ID.

Unfortunately I was not able to make 2/2 use a single IRQ source ID, because doing so is racy. Objects to track OADNs are created dynamically: we look through the existing ones for a match under spinlock and set up a new one if there's no match. On teardown, we can remove the OADN from the list under lock, but that same lock prevents us from de-assigning the IRQ ACK notifier or waiting for an RCU grace period. We must make sure that any unused GSI is de-asserted, but the above means it's possible that another OADN has already been created for this source ID/GSI, and de-asserting the GSI could lead to breakage. Instead, each OADN object gets its own source ID, and each object is shared by all users of the same GSI. So for PCI devices, we might have up to 4 IRQ source IDs allocated.

Michael had also suggested avoiding reference counting and using list_empty for this OADN object. Unfortunately, that doesn't work for similar reasons. We want to release the OADN object under lock, preventing others from re-using it on the free path, but in order to have lock-less de-assert & notify we use RCU, meaning we can't trust list_empty until after an RCU grace period, and waiting for one must be done outside of spinlocks.

If there are suggestions for how we can handle these better, please make them, but I think this compromise is race-free and still manages to make allocation of IRQ source IDs mostly a non-issue for device assignment limits.
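To illustrate the lifetime rules described above, here is a rough userspace sketch, not the code from the patches. A pthread mutex stands in for the spinlock, a per-GSI bitmask stands in for KVM's per-source-ID line state, and every name (struct oadn, oadn_get, oadn_unlink, oadn_finish, MAX_GSI, and so on) is hypothetical. It only models why a per-object source ID lets the slow half of teardown run outside the lock without disturbing a newer object on the same GSI.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_SOURCE_IDS 8   /* assumed limit, for illustration only */
#define MAX_GSI        32  /* assumed limit, for illustration only */

struct oadn {
    struct oadn *next;
    int gsi;
    int source_id;  /* private to this object */
    int refs;       /* number of users sharing this GSI */
};

static pthread_mutex_t oadn_lock = PTHREAD_MUTEX_INITIALIZER;
static struct oadn *oadn_list;
static unsigned char source_id_used[MAX_SOURCE_IDS];
static unsigned int line_state[MAX_GSI];  /* one bit per asserting source ID */

/* Find a live object for this GSI or create one, all under the lock. */
struct oadn *oadn_get(int gsi)
{
    struct oadn *o = NULL;
    int id;

    if (gsi < 0 || gsi >= MAX_GSI)
        return NULL;
    pthread_mutex_lock(&oadn_lock);
    for (o = oadn_list; o; o = o->next)
        if (o->gsi == gsi) {
            o->refs++;
            goto out;
        }
    for (id = 0; id < MAX_SOURCE_IDS && source_id_used[id]; id++)
        ;
    if (id < MAX_SOURCE_IDS && (o = malloc(sizeof(*o)))) {
        source_id_used[id] = 1;
        o->gsi = gsi;
        o->source_id = id;
        o->refs = 1;
        o->next = oadn_list;
        oadn_list = o;
    }
out:
    pthread_mutex_unlock(&oadn_lock);
    return o;
}

/* The line is high while any source ID asserts it. */
void oadn_assert(struct oadn *o)
{
    pthread_mutex_lock(&oadn_lock);
    line_state[o->gsi] |= 1u << o->source_id;
    pthread_mutex_unlock(&oadn_lock);
}

int gsi_level(int gsi)
{
    return line_state[gsi] != 0;
}

/* Fast half of teardown: drop a reference and, on the last one, unlink
 * the object under the lock so nobody can find and re-use it.
 * Returns 1 if the caller must now run oadn_finish(). */
int oadn_unlink(struct oadn *o)
{
    struct oadn **p;
    int last;

    pthread_mutex_lock(&oadn_lock);
    assert(o->refs > 0);
    last = (--o->refs == 0);
    for (p = &oadn_list; last && *p; p = &(*p)->next)
        if (*p == o) {
            *p = o->next;
            break;
        }
    pthread_mutex_unlock(&oadn_lock);
    return last;
}

/* Slow half: stands in for the work that cannot run under a spinlock
 * (removing the ack notifier, waiting for a grace period). A new object
 * for the same GSI may exist by now, so only our own source ID is
 * de-asserted. */
void oadn_finish(struct oadn *o)
{
    pthread_mutex_lock(&oadn_lock);
    line_state[o->gsi] &= ~(1u << o->source_id);
    source_id_used[o->source_id] = 0;
    pthread_mutex_unlock(&oadn_lock);
    free(o);
}

void oadn_put(struct oadn *o)
{
    if (oadn_unlink(o))
        oadn_finish(o);
}
```

In this model, if a second object for the same GSI is created and asserted between oadn_unlink() and oadn_finish() of the first, the line stays asserted after the first object is finished, because the two objects hold different source IDs. With a single shared source ID, the late de-assert would have cleared the new user's interrupt.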
Thanks,

Alex

---

Alex Williamson (2):
      kvm: On Ack, De-assert & Notify KVM_IRQFD extension
      kvm: Use a reserved IRQ source ID for irqfd

 Documentation/virtual/kvm/api.txt |   13 ++
 arch/x86/kvm/x86.c                |    4 +
 include/linux/kvm.h               |    7 +
 include/linux/kvm_host.h          |    2
 virt/kvm/eventfd.c                |  199 ++++++++++++++++++++++++++++++++++++-
 5 files changed, 218 insertions(+), 7 deletions(-)