KVM: for the scope

On Thu, Mar 21, 2024, Daniil Tatianin wrote:
> We would like to be able to create large VMs (up to 224 vCPUs atm) with
> up to 128 virtio-net cards, where each card needs a TX+RX queue per vCPU
> for optimal performance (as well as config & control interrupts per
> card). Adding in extra virtio-blk controllers with a queue per vCPU (up
> to 192 disks) yields a total of about ~100k IRQ routes, rounded up to
> 128k for extra headroom and flexibility.
>
> The current limit of 4096 was set in 2018 and is too low for modern
> demands. It also seems to be there for no good reason as routes are
> allocated lazily by the kernel anyway (depending on the largest GSI
> requested by the VM).
>
> Signed-off-by: Daniil Tatianin <d-tatianin@xxxxxxxxxxxxxx>
> ---
>  include/linux/kvm_host.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..10a141add2a8 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2093,7 +2093,7 @@ static inline bool mmu_invalidate_retry_gfn_unsafe(struct kvm *kvm,
>
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
>
> -#define KVM_MAX_IRQ_ROUTES 4096 /* might need extension/rework in the future */
> +#define KVM_MAX_IRQ_ROUTES 131072 /* might need extension/rework in the future */

I am not comfortable simply bumping the max. Yeah, it's allocated on-demand,
but if my math is correct, that means a max of ~8MiB for the table, plus
another 8MiB for the entries. And when handling KVM_SET_GSI_ROUTING, KVM will
have 2 tables (old and new), and another 4MiB for duplicating the userspace
array. Those allocations are accounted, but that's still a lot of potential
thrash.

And KVM's handling is also grossly inefficient, e.g. reallocating everything
just to change one routing entry is awful. Maybe painfully slow updates are
fine for your use case, but some OSes have a bad habit of round-robining IRQ
destinations on a regular basis.

So it might be "free" in the sense that it costs you nothing to get your use
case working, but there's very much a cost for KVM in the form of technical
debt that someone will have to eventually pay for.

I don't have any concrete thoughts on how to make KVM's implementation less
sucky, but I do think we need to give it some attention before increasing the
maximum number of IRQ routes, especially before increasing it by 32x.
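
To make the memory estimate above easier to check, here is a rough sketch of
the arithmetic. The per-route byte counts are assumptions back-derived from
the ~8MiB/~8MiB/~4MiB figures in this mail (8MiB / 131072 = 64 bytes, 4MiB /
131072 = 32 bytes); they are not sizeof() of any kernel structure. The helper
names are hypothetical, and the sketch assumes that both the table and its
entries exist twice (old and new) while an update is in flight.

```c
#include <stdint.h>

/* Limits taken from the quoted patch. */
#define OLD_MAX_IRQ_ROUTES 4096u
#define NEW_MAX_IRQ_ROUTES 131072u

/*
 * ASSUMED per-route costs, back-derived from the figures in the mail.
 * These are illustrative only, not real kernel structure sizes.
 */
#define ASSUMED_TABLE_BYTES_PER_ROUTE 64u
#define ASSUMED_ENTRY_BYTES_PER_ROUTE 64u
#define ASSUMED_UCOPY_BYTES_PER_ROUTE 32u

#define MIB (1024u * 1024u)

/* Footprint of one fully populated routing set (table + entries). */
static inline uint64_t routing_set_bytes(uint64_t nr_routes)
{
	return nr_routes * (ASSUMED_TABLE_BYTES_PER_ROUTE +
			    ASSUMED_ENTRY_BYTES_PER_ROUTE);
}

/*
 * Peak footprint while a routing update is in flight, assuming the old
 * set, the new set, and a copy of the userspace array all coexist.
 */
static inline uint64_t peak_update_bytes(uint64_t nr_routes)
{
	return 2 * routing_set_bytes(nr_routes) +
	       nr_routes * ASSUMED_UCOPY_BYTES_PER_ROUTE;
}
```

Under these assumptions, peak_update_bytes(NEW_MAX_IRQ_ROUTES) comes to 36MiB
per VM during an update, versus roughly 1.1MiB at the current limit of 4096.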