> On Nov 29, 2022, at 2:47 PM, Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote:
> 
> On Tue, 2022-11-29 at 13:22 -0500, Jon Kohler wrote:
>> Set vcpu->mode to EXITING_GUEST_MODE as soon as the vCPU exits to
>> reflect that we are indeed exiting guest mode, but not quite out of
>> guest mode yet. Note: This is done lazily, without an explicit memory
>> barrier, so that we do not regress the cost in the critical path
>> of going from the exit to the exit handler.
>>
>> Flip back to IN_GUEST_MODE for exits that use
>> EXIT_FASTPATH_REENTER_GUEST, such that we are IN_GUEST_MODE upon
>> reentry.
>>
>> Changing vcpu->mode away from IN_GUEST_MODE as early as possible
>> gives IPI senders as much runway as possible to avoid ringing the
>> doorbell or sending a posted interrupt IPI on AMD and Intel,
>> respectively. Since this is done without an explicit memory
>> barrier, the worst case is that the IPI sender still sees
>> IN_GUEST_MODE and sends a spurious event, which is the behavior
>> prior to this patch.
>
> Beware that we had a king-sized bug in regard to AVIC inhibition races
> vs guest entries, so this should be carefully checked here.

Thanks, Maxim - any pointers on what we should be looking for here?

>
> Also, do you have any perf numbers to see if that actually improves
> performance? (I am just curious, I do think that this can improve
> performance).
>

Yes indeed! Sorry, I should have put that right in the commit msg as a
note. Using kvm-unit-tests vmexit_ipi with -smp 20 on an Intel 8168,
it's roughly ~3% better (~3325-ish vs ~3400-ish), though the test is a
bit noisy even with taskset to a single socket.

To help validate that we were even getting *any* benefit, in a local
build I added a log statement (rough code below) to the IPI delivery
path, and did see many, many IPIs getting suppressed that would have
otherwise fired.

kvm_vcpu_trigger_posted_interrupt() {
	...
	if (vcpu->mode == EXITING_GUEST_MODE) {
		pr_warn_ratelimited("exiting suppression worked");
	}
	...
}
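For reference, the sender-side gate that makes this suppression
possible is the IN_GUEST_MODE check in the VMX posted interrupt path.
Sketching it from memory (simplified, comments elided - treat as
illustrative rather than verbatim):

static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu,
						     int pi_vec)
{
#ifdef CONFIG_SMP
	/* Only worth sending the notification IPI while the target is
	 * believed to be running in guest mode; for any other value of
	 * vcpu->mode (including EXITING_GUEST_MODE), the vCPU picks up
	 * the pending PIR bits itself on the way back into the guest,
	 * so the IPI can be elided.
	 */
	if (vcpu->mode == IN_GUEST_MODE) {
		apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec);
		return true;
	}
#endif
	/* Caller falls back to kicking/waking the target vCPU. */
	return false;
}

So every cycle a vCPU spends marked EXITING_GUEST_MODE before common
code moves it to OUTSIDE_GUEST_MODE is a cycle during which senders can
skip the IPI entirely.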
> Best regards,
> 	Maxim Levitsky
>
>
>>
>> Signed-off-by: Jon Kohler <jon@xxxxxxxxxxx>
>> ---
>>  arch/x86/kvm/svm/svm.c |  7 +++++++
>>  arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
>>  arch/x86/kvm/x86.c     |  8 ++++++++
>>  3 files changed, 38 insertions(+)
>>
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index ce362e88a567..5f0c118a3ffd 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3907,6 +3907,13 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
>>  	else
>>  		__svm_vcpu_run(svm, spec_ctrl_intercepted);
>>
>> +	/* Optimize IPI reduction by setting mode immediately after vmexit
>> +	 * without a memory barrier, as this is not paired anywhere. vcpu->mode
>> +	 * will be set to OUTSIDE_GUEST_MODE in x86 common code with a memory
>> +	 * barrier, after the host is done fully restoring various host states.
>> +	 */
>> +	vcpu->mode = EXITING_GUEST_MODE;
>> +
>>  	guest_state_exit_irqoff();
>>  }
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 63247c57c72c..243dcb87c727 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -5878,6 +5878,17 @@ static fastpath_t handle_fastpath_preemption_timer(struct kvm_vcpu *vcpu)
>>
>>  	if (!vmx->req_immediate_exit &&
>>  	    !unlikely(vmx->loaded_vmcs->hv_timer_soft_disabled)) {
>> +		/* Reset IN_GUEST_MODE since we're going to reenter the
>> +		 * guest as part of this fast path. This is done as
>> +		 * an optimization without a memory barrier since
>> +		 * EXITING_GUEST_MODE is also set without a memory
>> +		 * barrier. This also needs to be reset prior to
>> +		 * calling apic_timer_expired() so that
>> +		 * kvm_use_posted_timer_interrupt() returns the proper
>> +		 * value.
>> +		 */
>> +		if (vcpu->mode == EXITING_GUEST_MODE)
>> +			vcpu->mode = IN_GUEST_MODE;
>>  		kvm_lapic_expired_hv_timer(vcpu);
>>  		return EXIT_FASTPATH_REENTER_GUEST;
>>  	}
>> @@ -7031,6 +7042,18 @@ void noinstr vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
>>  void noinstr vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx,
>>  					unsigned int flags)
>>  {
>> +	struct kvm_vcpu *vcpu = &vmx->vcpu;
>> +
>> +	/* Optimize IPI reduction by setting mode immediately after vmexit
>> +	 * without a memory barrier, as this is not paired anywhere. vcpu->mode
>> +	 * will be set to OUTSIDE_GUEST_MODE in x86 common code with a memory
>> +	 * barrier, after the host is done fully restoring various host states.
>> +	 * Since the rdmsr and wrmsr below are expensive, this must be done
>> +	 * first, so that the IPI suppression window covers the time dealing
>> +	 * with fixing up SPEC_CTRL.
>> +	 */
>> +	vcpu->mode = EXITING_GUEST_MODE;
>> +
>>  	u64 hostval = this_cpu_read(x86_spec_ctrl_current);
>>
>>  	if (!cpu_feature_enabled(X86_FEATURE_MSR_SPEC_CTRL))
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 2835bd796639..0e0d228f3fa5 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2160,6 +2160,14 @@ fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu)
>>  		data = kvm_read_edx_eax(vcpu);
>>  		if (!handle_fastpath_set_tscdeadline(vcpu, data)) {
>>  			kvm_skip_emulated_instruction(vcpu);
>> +			/* Reset IN_GUEST_MODE since we're going to reenter the
>> +			 * guest as part of this fast path. This is done as
>> +			 * an optimization without a memory barrier since
>> +			 * EXITING_GUEST_MODE is also set without a memory
>> +			 * barrier.
>> +			 */
>> +			if (vcpu->mode == EXITING_GUEST_MODE)
>> +				vcpu->mode = IN_GUEST_MODE;
>>  			ret = EXIT_FASTPATH_REENTER_GUEST;
>>  		}
>>  		break;
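P.S. for anyone tracing the kvm_use_posted_timer_interrupt() dependency
called out in the vmx.c comment above: that helper keys directly off
vcpu->mode, roughly as below (simplified from lapic.c, from memory, so
double check against the tree), which is why the fastpath must flip back
to IN_GUEST_MODE before apic_timer_expired() is called:

static bool kvm_use_posted_timer_interrupt(struct kvm_vcpu *vcpu)
{
	/* Posting the expired timer interrupt via the PIR is only valid
	 * while the vCPU is seen as in guest mode; otherwise the timer
	 * interrupt is injected through the slower path. The fastpath
	 * must therefore restore IN_GUEST_MODE before this runs.
	 */
	return kvm_can_post_timer_interrupt(vcpu) &&
	       (vcpu->mode == IN_GUEST_MODE);
}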