Patch "perf evsel: Fixes topdown events in a weak group for the hybrid platform" has been added to the 5.17-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    perf evsel: Fixes topdown events in a weak group for the hybrid platform

to the 5.17-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     perf-evsel-fixes-topdown-events-in-a-weak-group-for-.patch
and it can be found in the queue-5.17 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 7fa4e92667ce3fe6600183c83754e7a325c8bf3a
Author: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
Date:   Wed May 18 07:38:57 2022 -0700

    perf evsel: Fixes topdown events in a weak group for the hybrid platform
    
    [ Upstream commit 39d5f412da84784bcc7f39ed49e55376be526fc7 ]
    
    The patch ("perf evlist: Keep topdown counters in weak group") fixes the
    perf metrics topdown event issue when the topdown events are in a weak
    group on a non-hybrid platform. However, it doesn't work for the hybrid
    platform.
    
      $./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
      cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
      cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
      cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
      cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
      cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
      cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1
    
      Performance counter stats for 'system wide':
    
           751,765,068      cpu_core/slots/                        (84.07%)
       <not supported>      cpu_core/topdown-bad-spec/
       <not supported>      cpu_core/topdown-be-bound/
       <not supported>      cpu_core/topdown-fe-bound/
       <not supported>      cpu_core/topdown-retiring/
            12,398,197      cpu_core/branch-instructions/          (84.07%)
             1,054,218      cpu_core/branch-misses/                (84.24%)
           539,764,637      cpu_core/bus-cycles/                   (84.64%)
                14,683      cpu_core/cache-misses/                 (84.87%)
             7,277,809      cpu_core/cache-references/             (77.30%)
           222,299,439      cpu_core/cpu-cycles/                   (77.28%)
            63,661,714      cpu_core/instructions/                 (84.85%)
                     0      cpu_core/mem-loads/                    (77.29%)
            12,271,725      cpu_core/mem-stores/                   (77.30%)
           542,241,102      cpu_core/ref-cycles/                   (84.85%)
                 8,854      cpu_core/cache-misses/                 (76.71%)
             7,179,013      cpu_core/cache-references/             (76.31%)
    
             1.003245250 seconds time elapsed
    
    A hybrid platform has a different PMU name for the core PMUs, while
    the current perf hard code the PMU name "cpu".
    
    The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
    For a hybrid platform, the pmu_name must be non-NULL. Because there are
    at least two core PMUs. The PMU has to be specified.
    For a non-hybrid platform, the pmu_name may be NULL. Because there is
    only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
    it is a "cpu" PMU.
    
    In case other PMUs also define the "slots" event, checking the PMU type
    as well.
    
    With the patch,
    
      $ perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
      cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
      cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
      cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
      cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
      cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
      cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1
    
      Performance counter stats for 'system wide':
    
         766,620,266   cpu_core/slots/                                        (84.06%)
          73,172,129   cpu_core/topdown-bad-spec/ #    9.5% bad speculation   (84.06%)
         193,443,341   cpu_core/topdown-be-bound/ #    25.0% backend bound    (84.06%)
         403,940,929   cpu_core/topdown-fe-bound/ #    52.3% frontend bound   (84.06%)
         102,070,237   cpu_core/topdown-retiring/ #    13.2% retiring         (84.06%)
          12,364,429   cpu_core/branch-instructions/                          (84.03%)
           1,080,124   cpu_core/branch-misses/                                (84.24%)
         564,120,383   cpu_core/bus-cycles/                                   (84.65%)
              36,979   cpu_core/cache-misses/                                 (84.86%)
           7,298,094   cpu_core/cache-references/                             (77.30%)
         227,174,372   cpu_core/cpu-cycles/                                   (77.31%)
          63,886,523   cpu_core/instructions/                                 (84.87%)
                   0   cpu_core/mem-loads/                                    (77.31%)
          12,208,782   cpu_core/mem-stores/                                   (77.31%)
         566,409,738   cpu_core/ref-cycles/                                   (84.87%)
              23,118   cpu_core/cache-misses/                                 (76.71%)
           7,212,602   cpu_core/cache-references/                             (76.29%)
    
           1.003228667 seconds time elapsed
    
    Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
    Acked-by: Ian Rogers <irogers@xxxxxxxxxx>
    Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
    Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
    Cc: Ingo Molnar <mingo@xxxxxxxxxx>
    Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
    Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
    Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
    Cc: Stephane Eranian <eranian@xxxxxxxxxx>
    Cc: Xing Zhengjun <zhengjun.xing@xxxxxxxxxxxxxxx>
    Link: https://lore.kernel.org/r/20220518143900.1493980-2-kan.liang@xxxxxxxxxxxxxxx
    Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c
index 0c9e56ab07b5..ff4561b7b600 100644
--- a/tools/perf/arch/x86/util/evsel.c
+++ b/tools/perf/arch/x86/util/evsel.c
@@ -31,10 +31,29 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr)
 	free(env.cpuid);
 }
 
+/* Check whether the evsel's PMU supports the perf metrics */
+static bool evsel__sys_has_perf_metrics(const struct evsel *evsel)
+{
+	const char *pmu_name = evsel->pmu_name ? evsel->pmu_name : "cpu";
+
+	/*
+	 * The PERF_TYPE_RAW type is the core PMU type, e.g., "cpu" PMU
+	 * on a non-hybrid machine, "cpu_core" PMU on a hybrid machine.
+	 * The slots event is only available for the core PMU, which
+	 * supports the perf metrics feature.
+	 * Checking both the PERF_TYPE_RAW type and the slots event
+	 * should be good enough to detect the perf metrics feature.
+	 */
+	if ((evsel->core.attr.type == PERF_TYPE_RAW) &&
+	    pmu_have_event(pmu_name, "slots"))
+		return true;
+
+	return false;
+}
+
 bool arch_evsel__must_be_in_group(const struct evsel *evsel)
 {
-	if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) ||
-	    !pmu_have_event("cpu", "slots"))
+	if (!evsel__sys_has_perf_metrics(evsel))
 		return false;
 
 	return evsel->name &&



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux