On Fri, 13 Mar 2020 at 11:23, Xu, Like <like.xu@xxxxxxxxx> wrote:
>
> Hi Wanpeng,
>
> On 2020/3/12 19:05, Wanpeng Li wrote:
> > On Thu, 12 Mar 2020 at 18:36, Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote:
> >> Wanpeng Li <kernellwp@xxxxxxxxx> writes:
> >>
> >>> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> >>>
> >>> The PMU is not exposed to the guest by most cloud providers because of the
> >>> poor performance of PMU emulation and security concerns. However, KVM calls
> >>> perf_guest_get_msrs() and clear_atomic_switch_msr() unconditionally before
> >>> each vmentry, even when the PMU is not exposed to the guest.
> >>>
> >>> A ~1.28% reduction of vmexit time can be observed with
> >>> kvm-unit-tests/vmexit.flat on my SKX server.
> >>>
> >>> Before patch:
> >>> vmcall 1559
> >>>
> >>> After patch:
> >>> vmcall 1539
> >>>
> >>> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> >>> ---
> >>>  arch/x86/kvm/vmx/vmx.c | 3 +++
> >>>  1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >>> index 40b1e61..fd526c8 100644
> >>> --- a/arch/x86/kvm/vmx/vmx.c
> >>> +++ b/arch/x86/kvm/vmx/vmx.c
> >>> @@ -6441,6 +6441,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
> >>>         int i, nr_msrs;
> >>>         struct perf_guest_switch_msr *msrs;
> >>>
> >>> +       if (!vcpu_to_pmu(&vmx->vcpu)->version)
> >>> +               return;
> >>> +
> >>>         msrs = perf_guest_get_msrs(&nr_msrs);
> >>>
> >>>         if (!msrs)
> >> Personally, I'd prefer this to be expressed as
> >>
> >> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >> index 40b1e6138cd5..ace92076c90f 100644
> >> --- a/arch/x86/kvm/vmx/vmx.c
> >> +++ b/arch/x86/kvm/vmx/vmx.c
> >> @@ -6567,7 +6567,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >>
> >>         pt_guest_enter(vmx);
> >>
> >> -       atomic_switch_perf_msrs(vmx);
> >> +       if (vcpu_to_pmu(&vmx->vcpu)->version)
> We may use 'vmx->vcpu.arch.pmu.version'.

Thanks for confirming this. Maybe this is better:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 40b1e61..b20423c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6567,7 +6567,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)

        pt_guest_enter(vmx);

-       atomic_switch_perf_msrs(vmx);
+       if (vcpu_to_pmu(vcpu)->version)
+               atomic_switch_perf_msrs(vmx);

        atomic_switch_umwait_control_msr(vmx);

        if (enable_preemption_timer)

>
> I would vote in favor of adding the unlikely(vmx->vcpu.arch.pmu.version)
> check to atomic_switch_perf_msrs(), which follows pt_guest_enter(vmx).

This is a hot path; let's save the cost of the function call.

    Wanpeng

>
> >> +       atomic_switch_perf_msrs(vmx);
> >> +
> > I just prefer the nicer-looking code above. I tested this version before
> > sending out the patch; ~30 cycles can be saved, which is ~2% of vmexit
> > time. I will update it in the next version. Let's wait for Paolo's opinion
> > on the other points below.
>
> You may want to factor the cost of the 'pmu->version' check itself
> (~10 cycles) into your overall micro-optimization gains.
>
> Thanks,
> Like Xu
>
> > Wanpeng
> >
> >> Also (not knowing much about the PMU), is the
> >> "vcpu_to_pmu(&vmx->vcpu)->version" check correct?
> >>
> >> E.g. is it only correct for the Intel PMU (as in intel_is_valid_msr()),
> >> or is it stated somewhere that this is a generic rule?
> >>
> >> Also, speaking about cloud providers and the 'micro' nature of this
> >> optimization, would it rather make sense to introduce a static branch
> >> (the policy to disable the vPMU is likely to be host-wide, right)?
> >>
> >> --
> >> Vitaly
> >>
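
For readers wondering what Vitaly's static-branch suggestion could look like in
practice, here is a minimal sketch. This is not code from the thread: the key
name (kvm_vpmu_in_use) and the refresh helper are hypothetical, and it assumes
the policy of disabling the vPMU really is host-wide.

#include <linux/jump_label.h>

/*
 * Sketch only: flip a host-wide static key the first time any vCPU is
 * configured with a vPMU. Key name and helper are illustrative.
 */
static DEFINE_STATIC_KEY_FALSE(kvm_vpmu_in_use);

/* Hypothetical hook, called once a vCPU's PMU version is known. */
static void kvm_vpmu_refresh_static_key(struct kvm_vcpu *vcpu)
{
        if (vcpu_to_pmu(vcpu)->version)
                static_branch_enable(&kvm_vpmu_in_use);
}

Then, in vmx_vcpu_run():

        pt_guest_enter(vmx);

        /* Patched out at runtime on hosts where no guest has a vPMU. */
        if (static_branch_unlikely(&kvm_vpmu_in_use))
                atomic_switch_perf_msrs(vmx);

Compared with a per-vmentry pmu->version check, the branch is patched out of
the hot path entirely on hosts that never expose a vPMU, at the cost of the
key being effectively one-way unless vPMU teardown is also tracked.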