> On Oct 7, 2021, at 11:36 PM, Like Xu <like.xu.linux@xxxxxxxxx> wrote: > > On 8/10/2021 1:46 pm, Song Liu wrote: >>> On Oct 7, 2021, at 8:34 PM, Like Xu <like.xu.linux@xxxxxxxxx> wrote: >>> >>> On 30/9/2021 4:05 am, Song Liu wrote: >>>> Hi Kan, >>>>> On Sep 29, 2021, at 9:35 AM, Liang, Kan <kan.liang@xxxxxxxxx> wrote: >>>>> >>>>>>>> - get confirmation that clearing GLOBAL_CTRL is suffient to supress >>>>>>>> PEBS, in which case we can simply remove the PEBS_ENABLE clear. >>>>>>> >>>>>>> How should we confirm this? Can we run some tests for this? Or do we >>>>>>> need hardware experts' input for this? >>>>>> >>>>>> I'll put it on the list to ask the hardware people when I talk to them next. But >>>>>> maybe Kan or Andi know without asking. >>>>> >>>>> If the GLOBAL_CTRL is explicitly disabled, the counters do not count anymore. >>>>> It doesn't matter if PEBS is enabled or not. >>>>> >>>>> See 6c1c07b33eb0 ("perf/x86/intel: Avoid unnecessary PEBS_ENABLE MSR >>>>> access in PMI "). We optimized the PMU handler base on it. >>>> Thanks for these information! >>>> IIUC, all we need is the following on top of bpf-next/master: >>>> diff --git i/arch/x86/events/intel/core.c w/arch/x86/events/intel/core.c >>>> index 1248fc1937f82..d0d357e7d6f21 100644 >>>> --- i/arch/x86/events/intel/core.c >>>> +++ w/arch/x86/events/intel/core.c >>>> @@ -2209,7 +2209,6 @@ intel_pmu_snapshot_branch_stack(struct perf_branch_entry *entries, unsigned int >>>> /* must not have branches... */ >>>> local_irq_save(flags); >>>> __intel_pmu_disable_all(false); /* we don't care about BTS */ >>> >>> If the value passed in is true, does it affect your use case? >>> >>>> - __intel_pmu_pebs_disable_all(); >>> >>> In that case, we can reuse "static __always_inline void intel_pmu_disable_all(void)" >>> regardless of whether PEBS is supported or enabled inside the guest and the host ? >>> >>>> __intel_pmu_lbr_disable(); >>> >>> How about using intel_pmu_lbr_disable_all() to cover Arch LBR? >> We are using LBR without PMI, so there isn't any hardware mechanism to >> stop the LBR, we have to stop it in software. There is always a delay >> between the event triggers and the LBR is stopped. In this window, > > Do you use counters for snapshot branch stack? > > Can the assumption of "without PMI" be broken sine Intel does have > the hardware mechanism like "freeze LBR on counter overflow > (aka, DEBUGCTLMSR_FREEZE_LBRS_ON_PMI)" ? We are capturing LBR on software events. For example, when a complex syscall, such as sys_bpf() and sys_perf_event_open(), returns -EINVAL, it is not obvious what wen wrong. The branch stack at the return (on a kretprobe or fexit) could give us additional information. > >> the LBR is still running and old entries are being replaced by new entries. >> We actually need the old entries before the triggering event, so the key >> design goal here is to minimize the number of branch instructions between >> the event triggers and the LBR is stopped. > > Yes, it makes sense. > >> Here, both __intel_pmu_disable_all(false) and __intel_pmu_lbr_disable() >> are used to optimize for this goal: the fewer branch instructions the >> better. > > Is it possible that we have another LBR in-kernel user in addition to the current > BPF-LBR snapshot user, such as another BPF-LBR snapshot user or a LBR perf user ? I think it is OK to have another user. We just need to capture the LBR entries. In fact, we simply enable LBR by opening a perf_event on each CPU. So from the kernel's point of view, the LBR is owned used by "another user". > > In the intel_pmu_snapshot_[arch]_branch_stack(), what if there is a PMI or NMI handler > to be called before __intel_pmu_lbr_disable(), which means more branch instructions > (assuming we don't use the FREEZE_LBRS_ON_xxx capability)? If we are unlucky and hit an NMI, we may get garbage data. The user will run the test again. > How about try to disable LBR at the earliest possible time, before __intel_pmu_disable_all(false) ? I am not sure which solution is the best here. On bare metal, current version works fine (available in bpf-next tree). > >> After removing __intel_pmu_pebs_disable_all() from >> intel_pmu_snapshot_branch_stack(), we found quite a few LBR entries in >> extable related code. With these entries, snapshot branch stack is not > > Are you saying that you still need to call > __intel_pmu_pebs_disable_all() to maintain precision ? I think we don't need pebs_disable_all. In the VM, pebs_disable_all will trigger "unchecked MSR access error" warning. After removing it, the warning message is gone. However, after we remove pebs_disable_all, we still see too many LBR entries are flushed before LBR is stopped. Most of these new entries are in extable code. I guess this is because the VM access these MSR differently. Thanks, Song