On Thu, Sep 20, 2018 at 06:05:58PM +0800, Wei Wang wrote: > diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c > index 5ab4a36..97a29d7 100644 > --- a/arch/x86/kvm/pmu_intel.c > +++ b/arch/x86/kvm/pmu_intel.c > @@ -342,6 +342,47 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) > pmu->global_ovf_ctrl = 0; > } > > +int intel_pmu_enable_save_guest_lbr(struct kvm_vcpu *vcpu) > +{ > + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); > + struct perf_event *event; > + struct perf_event_attr attr = { > + .type = PERF_TYPE_RAW, > + .size = sizeof(attr), Bit yuck to imply .config = 0, like that. > + .pinned = true, Note that this can still result in failing to schedule the event, you create a pinned task event, but a pinned cpu event takes precedence and can claim the LBR. How do you deal with that situation, where the LBR is in use by a host event? > + .exclude_host = true, Didn't you mean to use .exclude_guest ? You don't want this thing to run when the guest is running, right? But I still don't like this hack much, what happens if userspace sets that bit? You seems to have forgotten to combine it with PF_VCPU or whatever is the appropriate bit to check. Similarly, how does the existing exclude_{gues,host} crud work? > + .sample_type = PERF_SAMPLE_BRANCH_STACK, What's the point of setting .sample_type if !.sample_period ? > + .branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK | > + PERF_SAMPLE_BRANCH_USER | > + PERF_SAMPLE_BRANCH_KERNEL, > + }; I think this function wants a comment to explain wth its doing and why, because if someone stumbles over this code in a few months nobody will remember those things. > + > + if (pmu->guest_lbr_event) > + return 0; > + > + event = perf_event_create_kernel_counter(&attr, -1, current, NULL, > + NULL); > + if (IS_ERR(event)) { > + pr_err("%s: failed %ld\n", __func__, PTR_ERR(event)); > + return -ENOENT; > + } > + pmu->guest_lbr_event = event; > + > + return 0; > +}