On Thu, Oct 31, 2024, Bernhard Kauer wrote:
> > > > In practice, this almost never happens though. Do you have a use case for
> > > > creating VMs without in-kernel local APICs?
> > >
> > > I switched from "full irqchip" to "no irqchip" due to a significant
> > > performance gain
> >
> > Significant performance gain for what path? I'm genuinely curious.
>
> I have this really slow PREEMPT_RT kernel (Debian 6.11.4-rt-amd64).
> The hello-world benchmark takes on average 100ms. With IRQCHIP it goes
> up to 220ms. An strace gives 83ms for the extra ioctl:
>
>     ioctl(4, KVM_CREATE_IRQCHIP, 0) = 0 <0.083242>
>
> My current theory is that RCU takes ages on this kernel. And creating an
> IOAPIC uses SRCU to synchronize the bus array...
>
> However, in my latest benchmark runs the overhead for IRQCHIP is down to 15
> microseconds. So no big deal anymore.

Assuming you're running a recent kernel, that's likely thanks to commit
fbe4a7e881d4 ("KVM: Setup empty IRQ routing when creating a VM").

> > Unless your VM doesn't need a timer and doesn't need interrupts of
> > any kind, emulating the local APIC in userspace is going to be much
> > less performant.
>
> Do you have any performance numbers?

Heh, nope. I actually tried to grab some, mostly out of curiosity again, but
recent (last few years) versions of QEMU don't even support a userspace APIC.

A single EOI is a great example though. On a remotely modern CPU, an in-kernel
APIC allows KVM to enable hardware acceleration so that the EOI is virtualized
by hardware, i.e. it doesn't take a VM-Exit, and so the latency is basically
the same as a native EOI (tens of cycles, maybe less).

With a userspace APIC, the roundtrip to userspace to emulate the EOI is
measured in tens of thousands of cycles. IIRC, last I played around with
userspace exits the average turnaround time was ~50k cycles.
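
For anyone wanting to re-check the KVM_CREATE_IRQCHIP cost quoted above
without strace, here is a minimal sketch that times the ioctl directly. It is
Python rather than C purely for brevity. The function name
time_create_irqchip() is made up for illustration, the ioctl numbers are
hardcoded from <linux/kvm.h> (_IO(KVMIO, 0x01) and _IO(KVMIO, 0x60) with
KVMIO = 0xAE), and it assumes an x86 host with read/write access to /dev/kvm.
It takes a single sample, so treat it as a sanity check, not a benchmark.

```python
import fcntl
import os
import time

# ioctl numbers from <linux/kvm.h>: _IO(KVMIO, nr) with KVMIO = 0xAE.
KVM_CREATE_VM = 0xAE01       # _IO(KVMIO, 0x01)
KVM_CREATE_IRQCHIP = 0xAE60  # _IO(KVMIO, 0x60)


def time_create_irqchip(path="/dev/kvm"):
    """Return seconds spent in KVM_CREATE_IRQCHIP, or None if KVM is unusable.

    Hypothetical helper for illustration: creates a throwaway VM, times the
    one ioctl, and closes both file descriptors again.
    """
    try:
        kvm_fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None  # no /dev/kvm, or no permission to open it

    vm_fd = -1
    try:
        # KVM_CREATE_VM returns a new file descriptor for the VM.
        vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        start = time.perf_counter()
        fcntl.ioctl(vm_fd, KVM_CREATE_IRQCHIP, 0)
        return time.perf_counter() - start
    except OSError:
        return None  # ioctl rejected, e.g. unsupported on this architecture
    finally:
        if vm_fd >= 0:
            os.close(vm_fd)
        os.close(kvm_fd)


if __name__ == "__main__":
    elapsed = time_create_irqchip()
    if elapsed is None:
        print("KVM not available, nothing measured")
    else:
        print(f"KVM_CREATE_IRQCHIP took {elapsed * 1e6:.1f} us")
```

Running it a handful of times on kernels with and without the commit above
should show whether the ioctl is in the tens-of-milliseconds or the
tens-of-microseconds range on a given setup.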