Thoughts on (not) expanding the KVM IRQ routing table

I don't think we need to expand the KVM IRQ routing table past 4096
entries just to allow guests to have more PCI devices and more MSIs.

AFAICT there's no good reason why each incoming (usually from VFIO)
eventfd has to be given a GSI#, and why that GSI# then has to be looked
up in the huge KVM IRQ routing table.

Could we just embed a struct kvm_kernel_irq_routing_entry directly into
the struct kvm_kernel_irqfd, and set its ->irq_entry to point to that
in the "direct no GSI#" case? 

In irqfd_wakeup() we could call kvm_arch_set_irq_inatomic() with that
directly (which we already do; we just need to avoid the seqcount
handling and lookup in the gsi==0 case).

If kvm_arch_set_irq_inatomic() returns -EWOULDBLOCK then we end up in
irqfd_inject() on the workqueue, which can call the ->irq_entry->set()
function if gsi==0, instead of using kvm_set_irq() as it does now.
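
The two-stage delivery described above can be modelled in userspace
roughly as follows. This is a sketch under assumptions, not kernel code:
the model_* functions stand in for irqfd_wakeup() and irqfd_inject(),
set_irq_inatomic() stands in for kvm_arch_set_irq_inatomic(), and the
can_inject_atomically, inject_scheduled and routed_injections fields exist
only so that the model can be observed.

```c
#include <errno.h>
#include <stddef.h>

struct entry {
    int can_inject_atomically;   /* model knob, not a kernel field */
    int delivered;               /* counts ->set() calls */
    int (*set)(struct entry *e, int level);
};

struct irqfd {
    unsigned int gsi;            /* 0 models the "direct no GSI#" case */
    struct entry *irq_entry;
    int inject_scheduled;        /* models the deferred work being queued */
    int routed_injections;       /* models calls to kvm_set_irq() */
};

/* Example ->set() callback: just record the delivery. */
int entry_set(struct entry *e, int level)
{
    (void)level;
    e->delivered++;
    return 0;
}

/* Stand-in for kvm_arch_set_irq_inatomic(). */
int set_irq_inatomic(struct entry *e)
{
    if (!e->can_inject_atomically)
        return -EWOULDBLOCK;
    return e->set(e, 1);
}

/* Models irqfd_wakeup(): try the atomic path, else defer. */
void model_irqfd_wakeup(struct irqfd *irqfd)
{
    if (set_irq_inatomic(irqfd->irq_entry) == -EWOULDBLOCK)
        irqfd->inject_scheduled = 1;
}

/* Models irqfd_inject(): direct entries call ->set(), others are routed. */
void model_irqfd_inject(struct irqfd *irqfd)
{
    irqfd->inject_scheduled = 0;
    if (irqfd->gsi == 0)
        irqfd->irq_entry->set(irqfd->irq_entry, 1);
    else
        irqfd->routed_injections++;
}
```

In neither path does the gsi==0 case touch the routing table; the only
difference from today is which function the slow path ends up calling.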

We add a new ioctl KVM_IRQFD_DIRECT, which takes an eventfd and a
struct kvm_irq_routing_entry, and bypasses the GSI numbering and the
lookups, and calls kvm_set_routing_entry() to set up its embedded entry
from whatever the user provided.

I think we could fairly quickly get to a point where a KVM selftest
could deliver an event to a guest without having to mess with the
routing table.

(To start with, we'd not register the IRQ bypass consumer for these.
That's fairly easily fixable, but takes a little more typing and cross-
platform testing.)

I think it's reasonable to declare that there can be only *one* target
(per kvm) for a given eventfd. So deassign and modification could walk
a list just as kvm_irqfd_deassign() does right now, checking for
irqfd->eventfd == eventfd as it does, but just not caring about gsi#.
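
As a rough illustration of matching on the eventfd alone, here is a
userspace sketch with a plain singly linked list. The struct irqfd here is
a simplified stand-in, the eventfd is modelled as an int, and
irqfd_detach_by_eventfd() is a hypothetical name; the real code walks
kvm->irqfds.items under its lock and compares eventfd context pointers.

```c
#include <stddef.h>

struct irqfd {
    int eventfd;          /* stand-in for the eventfd context pointer */
    unsigned int gsi;     /* deliberately ignored by the lookup below */
    struct irqfd *next;
};

/* Unlink and return the irqfd attached to 'eventfd', or NULL if there is
 * none. With at most one target per eventfd, the first match is the only
 * match, so the walk can stop there. */
struct irqfd *irqfd_detach_by_eventfd(struct irqfd **head, int eventfd)
{
    for (struct irqfd **pp = head; *pp; pp = &(*pp)->next) {
        if ((*pp)->eventfd == eventfd) {
            struct irqfd *found = *pp;

            *pp = found->next;
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}
```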

If you really want to get clever about the scalability, you stop using
a *list* for kvm->irqfds.items and you use a saner data structure like
a tree instead. 

Or — at the risk of PeterZ coming across the North Sea and hurting me
for suggesting it — you just look through the entries on the eventfd's
waitq list, looking for one where the ->func is irqfd_wakeup, and then
you can find the already-attached struct kvm_kernel_irqfd directly
given only the eventfd without a separate structure at all.
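
For what it's worth, that last trick would look something like the
following userspace model. Everything here is a stand-in: struct wq_entry
is a much-simplified wait queue entry, irqfd_wakeup_model() takes the place
of irqfd_wakeup(), and find_irqfd_on_waitq() is a hypothetical name. The
kvm comparison is an assumption of mine, added because the "only one
target" rule is per kvm, so more than one irqfd could sit on the same
eventfd.

```c
#include <stddef.h>

/* Recover the containing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct wq_entry {
    int (*func)(struct wq_entry *w);
    struct wq_entry *next;
};

struct irqfd {
    void *kvm;               /* owning VM, opaque in this model */
    struct wq_entry wait;    /* entry queued on the eventfd's waitq */
};

/* Stand-in for irqfd_wakeup(); only its address matters to the search. */
int irqfd_wakeup_model(struct wq_entry *w) { (void)w; return 1; }

/* Some unrelated waiter on the same eventfd. */
int other_waiter(struct wq_entry *w) { (void)w; return 2; }

/* Walk the waiters and return the irqfd belonging to 'kvm', or NULL. */
struct irqfd *find_irqfd_on_waitq(struct wq_entry *head, void *kvm)
{
    for (struct wq_entry *w = head; w; w = w->next) {
        if (w->func != irqfd_wakeup_model)
            continue;   /* not one of ours */
        struct irqfd *irqfd = container_of(w, struct irqfd, wait);
        if (irqfd->kvm == kvm)
            return irqfd;
    }
    return NULL;
}
```

In the kernel this would of course have to be done with the waitq lock
held, which is rather the part that invites the trip across the North Sea.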

