Re: [PATCH] KVM: x86: Move kvm_check_request(KVM_REQ_NMI) after kvm_check_request(KVM_REQ_NMI)

Mingwei Zhang <mizhang@xxxxxxxxxx> · Tue, 26 Sep 2023 21:15:12 -0700

Ah, there is a typo in the subject: the second KVM_REQ_NMI should be
KVM_REQ_PMI. Sorry about that.

On Tue, Sep 26, 2023 at 9:09 PM Mingwei Zhang <mizhang@xxxxxxxxxx> wrote:
>
> Move kvm_check_request(KVM_REQ_NMI) after kvm_check_request(KVM_REQ_PMI).
> When the vPMU is in active use, processing each KVM_REQ_PMI generates a
> KVM_REQ_NMI. With the existing control flow, that KVM_REQ_NMI is still
> pending after KVM_REQ_PMI has been handled, so the guest entry fails, KVM
> jumps to kvm_x86_cancel_injection(), and vcpu_enter_guest() is re-entered.
> This wastes a lot of cycles and increases the overhead of the vPMU as well
> as of virtualization in general.
>
> So move the kvm_check_request(KVM_REQ_NMI) block to make the KVM run loop
> more efficient when the vPMU is in use.
>
> To evaluate the effectiveness of this change, we launch an 8-vcpu QEMU VM
> on an Intel SPR CPU. In the VM, we run perf with all 48 events that Intel
> VTune uses. As the workload, we use the SPEC2017 benchmark programs,
> configured to run on a single core with a single thread.
>
> At the host level, we count the invocations of vmx_cancel_injection() with
> the following commands:
>
>     $ perf probe -a vmx_cancel_injection
>     $ perf stat -a -e probe:vmx_cancel_injection -I 10000 # per 10 seconds
>
> The following results were collected at the beginning of the SPEC2017
> benchmark run (so mostly for 500.perlbench_r). Kindly forgive the
> incompleteness.
>
> On kernel without the change:
>     10.010018010              14254      probe:vmx_cancel_injection
>     20.037646388              15207      probe:vmx_cancel_injection
>     30.078739816              15261      probe:vmx_cancel_injection
>     40.114033258              15085      probe:vmx_cancel_injection
>     50.149297460              15112      probe:vmx_cancel_injection
>     60.185103088              15104      probe:vmx_cancel_injection
>
> On kernel with the change:
>     10.003595390                 40      probe:vmx_cancel_injection
>     20.017855682                 31      probe:vmx_cancel_injection
>     30.028355883                 34      probe:vmx_cancel_injection
>     40.038686298                 31      probe:vmx_cancel_injection
>     50.048795162                 20      probe:vmx_cancel_injection
>     60.069057747                 19      probe:vmx_cancel_injection
>
> From the above, the count drops from about 15000 to a few tens per
> 10-second interval, so we save roughly 1500 invocations of
> vmx_cancel_injection() per vcpu per second for workloads like perlbench.
>
> Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
> ---
>  arch/x86/kvm/x86.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 42a4e8f5e89a..302b6f8ddfb1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10580,12 +10580,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>                 if (kvm_check_request(KVM_REQ_SMI, vcpu))
>                         process_smi(vcpu);
>  #endif
> -               if (kvm_check_request(KVM_REQ_NMI, vcpu))
> -                       process_nmi(vcpu);
>                 if (kvm_check_request(KVM_REQ_PMU, vcpu))
>                         kvm_pmu_handle_event(vcpu);
>                 if (kvm_check_request(KVM_REQ_PMI, vcpu))
>                         kvm_pmu_deliver_pmi(vcpu);
> +               if (kvm_check_request(KVM_REQ_NMI, vcpu))
> +                       process_nmi(vcpu);
>                 if (kvm_check_request(KVM_REQ_IOAPIC_EOI_EXIT, vcpu)) {
>                         BUG_ON(vcpu->arch.pending_ioapic_eoi > 255);
>                         if (test_bit(vcpu->arch.pending_ioapic_eoi,
>
> base-commit: 73554b29bd70546c1a9efc9c160641ef1b849358
> --
> 2.42.0.515.g380fc7ccd1-goog
>
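
For anyone following the reasoning in the quoted commit message, here is a
minimal user-space sketch of why the order of the two checks matters. It is
a toy model, not kernel code: every toy_* name is hypothetical, the request
bits are made up, and it assumes that any request still pending at the end
of one pass over the checks forces a cancelled entry and another pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the KVM request bits; not the kernel's values. */
enum toy_req { TOY_REQ_NMI = 1u << 0, TOY_REQ_PMI = 1u << 1 };

struct toy_vcpu {
	unsigned int requests;   /* bitmask of pending requests */
	int cancelled_entries;   /* passes that ended with work still pending */
	int nmis_processed;
};

/* Old order: NMI is checked before PMI. New order: PMI first. */
static const unsigned int toy_old_order[] = { TOY_REQ_NMI, TOY_REQ_PMI };
static const unsigned int toy_new_order[] = { TOY_REQ_PMI, TOY_REQ_NMI };

/* Test and clear one request bit, in the spirit of kvm_check_request(). */
static bool toy_check_request(struct toy_vcpu *v, unsigned int req)
{
	if (!(v->requests & req))
		return false;
	v->requests &= ~req;
	return true;
}

static void toy_handle(struct toy_vcpu *v, unsigned int req)
{
	if (req == TOY_REQ_PMI)
		v->requests |= TOY_REQ_NMI;  /* assumption: a PMI queues an NMI */
	else if (req == TOY_REQ_NMI)
		v->nmis_processed++;
}

/* Repeat passes over the checks until nothing is pending. */
static void toy_enter_guest(struct toy_vcpu *v, const unsigned int *order, int n)
{
	for (int pass = 0; ; pass++) {
		assert(pass < 16);  /* guard against a request no check handles */
		for (int i = 0; i < n; i++)
			if (toy_check_request(v, order[i]))
				toy_handle(v, order[i]);
		if (!v->requests)
			return;               /* nothing pending: guest entry succeeds */
		v->cancelled_entries++;   /* models the cancel-and-retry path */
	}
}

/* Start with a single pending PMI and count the cancelled entries. */
static int toy_count_cancels(const unsigned int *order, int n)
{
	struct toy_vcpu v = { .requests = TOY_REQ_PMI };

	toy_enter_guest(&v, order, n);
	return v.cancelled_entries;
}
```

In this model, toy_count_cancels(toy_old_order, 2) returns 1 and
toy_count_cancels(toy_new_order, 2) returns 0: with the PMI check first, the
NMI it raises is consumed in the same pass instead of forcing a retry.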