On Tue, May 19, 2020 at 09:10:58PM +0800, Xu, Like wrote: > On 2020/5/19 19:15, Peter Zijlstra wrote: > > On Thu, May 14, 2020 at 04:30:53PM +0800, Like Xu wrote: > > > > > diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c > > > index ea4faae56473..db185dca903d 100644 > > > --- a/arch/x86/kvm/vmx/pmu_intel.c > > > +++ b/arch/x86/kvm/vmx/pmu_intel.c > > > @@ -646,6 +646,43 @@ static void intel_pmu_lbr_cleanup(struct kvm_vcpu *vcpu) > > > intel_pmu_free_lbr_event(vcpu); > > > } > > > +static bool intel_pmu_lbr_is_availabile(struct kvm_vcpu *vcpu) > > > +{ > > > + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); > > > + > > > + if (!pmu->lbr_event) > > > + return false; > > > + > > > + if (event_is_oncpu(pmu->lbr_event)) { > > > + intel_pmu_intercept_lbr_msrs(vcpu, false); > > > + } else { > > > + intel_pmu_intercept_lbr_msrs(vcpu, true); > > > + return false; > > > + } > > > + > > > + return true; > > > +} > > This is unreadable gunk, what? > > Abstractly, it is saying "KVM would passthrough the LBR satck MSRs if > event_is_oncpu() is true, otherwise cancel the passthrough state if any." > > I'm using 'event->oncpu != -1' to represent the guest LBR event > is scheduled on rather than 'event->state == PERF_EVENT_STATE_ERROR'. > > For intel_pmu_intercept_lbr_msrs(), false means to passthrough the LBR stack > MSRs to the vCPU, and true means to cancel the passthrough state and make > LBR MSR accesses trapped by the KVM. To me it seems very weird to change state in a function that is supposed to just query state. 'is_available' seems to suggest a simple: return 'lbr_event->state == PERF_EVENT_STATE_ACTIVE' or something. > > > +static void intel_pmu_availability_check(struct kvm_vcpu *vcpu) > > > +{ > > > + lockdep_assert_irqs_disabled(); > > > + > > > + if (lbr_is_enabled(vcpu) && !intel_pmu_lbr_is_availabile(vcpu) && > > > + (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR)) > > > + pr_warn_ratelimited("kvm: vcpu-%d: LBR is temporarily unavailable.\n", > > > + vcpu->vcpu_id); > > More unreadable nonsense; when the events go into ERROR state, it's a > > permanent fail, they'll not come back. > It's not true. The guest LBR event with 'ERROR state' or 'oncpu != -1' > would be > lazy released and re-created in the next time the > intel_pmu_create_lbr_event() is > called and it's supposed to be re-scheduled and re-do availability_check() > as well. Where? Also, wth would you need to destroy and re-create an event for that? > > > @@ -6696,8 +6696,10 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) > > > pt_guest_enter(vmx); > > > - if (vcpu_to_pmu(vcpu)->version) > > > + if (vcpu_to_pmu(vcpu)->version) { > > > atomic_switch_perf_msrs(vmx); > > > + kvm_x86_ops.pmu_ops->availability_check(vcpu); > > > + } > > AFAICT you just did a call out to the kvm_pmu crud in > > atomic_switch_perf_msrs(), why do another call? > In fact, availability_check() is only called here for just one time. > > The callchain looks like: > - vmx_vcpu_run() > - kvm_x86_ops.pmu_ops->availability_check(); > - intel_pmu_availability_check() > - intel_pmu_lbr_is_availabile() > - event_is_oncpu() ... > What I'm saying is that you just did a pmu_ops indirect call in atomic_switch_perf_msrs(), why add another?