It's enough to check the exit value and issue a direct call to avoid
the retpoline for all the common vmexit reasons.

Reducing this list to only EXIT_REASON_MSR_WRITE,
EXIT_REASON_PREEMPTION_TIMER, EXIT_REASON_EPT_MISCONFIG and
EXIT_REASON_IO_INSTRUCTION increases the computation time of the
hrtimer guest testcase on a Haswell i5-4670T CPU @ 2.30GHz by 7% with
the default spectre v2 mitigation enabled in the host and guest. On
Skylake, by contrast, there's no measurable difference with the short
list. To put things in perspective: on Haswell the same hrtimer
workload (note: it never calls cpuid and it never attempts to trigger
more vmexits on purpose) in the guest takes 16.3% longer to compute on
upstream KVM running in the host than with the KVM mono v1 patchset
applied to the host kernel, while on Skylake the same takes only 5.4%
more time (both with the default mitigations enabled in guest and
host).

It's also unclear why EXIT_REASON_IO_INSTRUCTION should be included.

Of course CONFIG_RETPOLINE already forbids gcc to use indirect jumps
while compiling all switch() statements, so a switch() would still
allow the compiler to bisect the value. However, a switch() seems to
run slower if anything, and the reason is that it's better to
prioritize and do the minimal possible number of checks for the most
common vmexits.

The halt and pause loop exiting may be slow paths from the point of
view of the guest, but not necessarily so from the point of view of
the host. There can be a flood of halt exit reasons (in fact that's
why the cpuidle guest haltpoll support was recently merged, and we
can't rely on it here because there are older kernels and other OSes
that must also perform optimally). All it takes is a pipe ping pong
with a different host CPU and the host CPUs running at full capacity.
The same consideration applies to the pause loop exiting exit reason:
if there's heavy host overcommit that collides heavily in a spinlock,
the same may happen.

In the common case of a fully idle host, the halt and pause loop
exiting can't help, but adding them doesn't hurt the common case, and
the expectation here is that if they ever become measurable, it will
be because they are increasing (and not decreasing) performance.

Signed-off-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
---
 arch/x86/kvm/vmx/vmx.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index de3ae2246205..2bd57a7d2be1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5846,9 +5846,29 @@ int kvm_x86_handle_exit(struct kvm_vcpu *vcpu)
 	}
 
 	if (exit_reason < kvm_vmx_max_exit_handlers
-	    && kvm_vmx_exit_handlers[exit_reason])
+	    && kvm_vmx_exit_handlers[exit_reason]) {
+#ifdef CONFIG_RETPOLINE
+		if (exit_reason == EXIT_REASON_MSR_WRITE)
+			return kvm_emulate_wrmsr(vcpu);
+		else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
+			return handle_preemption_timer(vcpu);
+		else if (exit_reason == EXIT_REASON_PENDING_INTERRUPT)
+			return handle_interrupt_window(vcpu);
+		else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+			return handle_external_interrupt(vcpu);
+		else if (exit_reason == EXIT_REASON_HLT)
+			return kvm_emulate_halt(vcpu);
+		else if (exit_reason == EXIT_REASON_PAUSE_INSTRUCTION)
+			return handle_pause(vcpu);
+		else if (exit_reason == EXIT_REASON_MSR_READ)
+			return kvm_emulate_rdmsr(vcpu);
+		else if (exit_reason == EXIT_REASON_CPUID)
+			return kvm_emulate_cpuid(vcpu);
+		else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
+			return handle_ept_misconfig(vcpu);
+#endif
 		return kvm_vmx_exit_handlers[exit_reason](vcpu);
-	else {
+	} else {
 		vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
 				exit_reason);
 		dump_vmcs();