On Tue, Oct 22, 2024 at 10:32:59AM -0700, Sean Christopherson wrote:
> On Fri, Oct 18, 2024, Bernhard Kauer wrote:
> > It used a static key to avoid loading the lapic pointer from
> > the vcpu->arch structure. However, in the common case the load
> > is from a hot cacheline and the CPU should be able to perfectly
> > predict it. Thus there is no upside of this premature optimization.
>
> Do you happen to have performance numbers?

Sure. I have some preliminary numbers as I'm still optimizing the
round-trip time for tiny virtual machines.

A hello-world micro benchmark on my AMD 6850U needs at least 331us. With
the static keys it requires 579us. That is a 75% increase.

Take the absolute values with a grain of salt as not all of my patches
might be applicable to the general case.

For the other side I don't have a relevant benchmark yet. But I doubt you
would see anything even with a very high IRQ rate.

> > The downside is that code patching including an IPI to all CPUs
> > is required whenever the first VM without an lapic is created or
> > the last is destroyed.
>
> In practice, this almost never happens though. Do you have a use case for
> creating VMs without in-kernel local APICs?

I switched from "full irqchip" to "no irqchip" due to a significant
performance gain and the simplicity it promised. I might have to go to
"split irqchip" mode for performance reasons but I haven't had time to
look into it yet.

So in the end I assume it will be a trade-off: Do I want to rely on these
3000 lines of kernel code to gain an X% performance increase, or not?