On Thu, Mar 28, 2024 at 4:21 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>
> * Andrii Nakryiko <andrii@xxxxxxxxxx> wrote:
>
> > [0] added ability to capture LBR (Last Branch Records) on Intel CPUs
> > from inside BPF program at pretty much any arbitrary point. This is
> > extremely useful capability that allows to figure out otherwise
> > hard-to-debug problems, because LBR is now available based on some
> > application-defined conditions, not just hardware-supported events.
> >
> > retsnoop ([1]) is one such tool that takes a huge advantage of this
> > functionality and has proved to be an extremely useful tool in
> > practice.
> >
> > Now, AMD Zen4 CPUs got support for similar LBR functionality, but
> > necessary wiring inside the kernel is not yet setup. This patch seeks to
> > rectify this and follows a similar approach to the original patch [0]
> > for Intel CPUs.
> >
> > Given LBR can be set up to capture any indirect jumps, it's critical to
> > minimize indirect jumps on the way to requesting LBR from BPF program,
> > so we split amd_pmu_lbr_disable_all() into a wrapper with some generic
> > conditions vs always-inlined __amd_pmu_lbr_disable() called directly
> > from BPF subsystem (through perf_snapshot_branch_stack static call).
> >
> > Now that it's possible to capture LBR on AMD CPU from BPF at arbitrary
> > point, there is no reason to artificially limit this feature to sampling
> > events. So corresponding check is removed. AFAIU, there is no
> > correctness implications of doing this (and it was possible to bypass
> > this check by just setting perf_event's sample_period to 1 anyways, so
> > it doesn't guard all that much).
> >
> > This was tested on AMD Bergamo CPU and worked well when utilized from
> > the aforementioned retsnoop tool.
> >
> > [0] https://lore.kernel.org/bpf/20210910183352.3151445-2-songliubraving@xxxxxx/
> > [1] https://github.com/anakryiko/retsnoop
> >
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > ---
> >  arch/x86/events/amd/core.c   | 29 ++++++++++++++++++++++++++++-
> >  arch/x86/events/amd/lbr.c    | 11 +----------
> >  arch/x86/events/perf_event.h | 11 +++++++++++
> >  3 files changed, 40 insertions(+), 11 deletions(-)
>
> Please do not queue these up in the BPF tree, all similar changes to
> perf code should go through the perf tree.
>

Absolutely, I rebased on top of tip's perf/core branch and sent it as v2. Thanks!

> Thanks,
>
>         Ingo