On 3/28/2022 1:11 PM, Stephane Eranian wrote:
On Mon, Mar 28, 2022 at 8:50 AM <kan.liang@xxxxxxxxxxxxxxx> wrote:
From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
Performance counter stats for 'CPU(s) 0':
607,246 cpu/event=0xc0,umask=0x0/
0 cpu/event=0x0,umask=0x1/
The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which
doesn't work on the generic counters. However, current perf extends its
counter mask to the generic counters.
The pseudo event-code for a fixed counter must be 0x00. Check for that
and avoid extending the mask for fixed counter events which use the
pseudo-encoding, e.g., the ref-cycles and PREC_DIST events.
With the patch,
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
Performance counter stats for 'CPU(s) 0':
583,184 cpu/event=0xc0,umask=0x0/
583,048 cpu/event=0x0,umask=0x1/
Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
arch/x86/events/intel/core.c | 6 +++++-
arch/x86/include/asm/perf_event.h | 5 +++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index db32ef6..1d2e49d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5668,7 +5668,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
/* Disabled fixed counters which are not in CPUID */
c->idxmsk64 &= intel_ctrl;
- if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+ /*
+ * Don't extend the pseudo-encoding to the
+ * generic counters
+ */
+ if (!use_fixed_pseudo_encoding(c->code))
c->idxmsk64 |= (1ULL << num_counters) - 1;
}
c->idxmsk64 &=
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 48e6ef56..cd85f03 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -242,6 +242,11 @@ struct x86_pmu_capability {
#define INTEL_PMC_IDX_FIXED_SLOTS (INTEL_PMC_IDX_FIXED + 3)
#define INTEL_PMC_MSK_FIXED_SLOTS (1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
+static inline bool use_fixed_pseudo_encoding(u64 code)
+{
+ return !(code & 0xff);
+}
+
I ack the problem.
That does not take into account the old encoding for PREC_DIST, 0x01c0,
which is also forced to fixed counter 0 on ICL and should not be extended.
The old encoding is no longer documented in the ICL event list. The only
PREC_DIST event for ICL uses the pseudo-encoding.
{
"EventCode": "0x00",
"UMask": "0x01",
"EventName": "INST_RETIRED.PREC_DIST",
"BriefDescription": "Precise instruction retired event with a
reduced effect of PEBS shadow in IP distribution",
"PublicDescription": "A version of INST_RETIRED that allows for a
more unbiased distribution of samples across instructions retired. It
utilizes the Precise Distribution of Instructions Retired (PDIR) feature
to mitigate some bias in how retired instructions get sampled. Use on
Fixed Counter 0.",
"Counter": "Fixed counter 0",
Ideally, I think we should remove the old encoding 0x01c0 from the
constraints table rather than force it to fixed counter 0 only.
If so, that should be a separate patch.
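To make the 0x01c0 point concrete, here is a minimal user-space sketch of the check. The helper body is the one from the patch, with u64 swapped for uint64_t so it builds outside the kernel. The SKETCH_* names are hypothetical labels for codes mentioned in this thread, not kernel macros.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same test as the helper added by the patch: a constraint code whose
 * low byte (the event select) is 0x00 is treated as a pseudo-encoding.
 */
static inline bool use_fixed_pseudo_encoding(uint64_t code)
{
	return !(code & 0xff);
}

/* Hypothetical labels; each code is (umask << 8) | event. */
#define SKETCH_PREC_DIST_PSEUDO   0x0100ULL /* INST_RETIRED.PREC_DIST */
#define SKETCH_REF_CYCLES_PSEUDO  0x0300ULL /* ref-cycles */
#define SKETCH_SLOTS_PSEUDO       0x0400ULL /* SLOTS on fixed counter 3 */
#define SKETCH_PREC_DIST_OLD      0x01c0ULL /* old PREC_DIST encoding */
#define SKETCH_TOPDOWN_SLOTS_P    0x01a4ULL /* GP version of SLOTS */
```

With these values the three pseudo-encodings pass the check, while 0x01c0 and 0x01a4 do not, because their event-select byte is nonzero. So the old PREC_DIST encoding is not covered by this helper and has to be dealt with separately.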
That also limits the options for the SLOTS event, which can be measured
by a GP counter. Yet to work with PERF_METRICS, it has to be programmed
into fixed counter 3.
For the SLOTS event that can only work with PERF_METRICS, current perf
already limits it as below.
FIXED_EVENT_CONSTRAINT(0x0400, 3), /* SLOTS */
No behavior is changed with this patch.
The GP version of SLOTS is 0x01a4. According to the event list, it can
be scheduled on all GP counters, so it's not added to the constraints
table.
"EventCode": "0xa4",
"UMask": "0x01",
"EventName": "TOPDOWN.SLOTS_P",
"BriefDescription": "TMA slots available for an unhalted logical
processor. General counter - architectural event",
"PublicDescription": "Counts the number of available slots for an
unhalted logical processor. The event increments by machine-width of the
narrowest pipeline as employed by the Top-down Microarchitecture
Analysis method. The count is distributed among unhalted logical
processors (hyper-threads) who share the same physical core.",
"Counter": "0,1,2,3,4,5,6,7",
"PEBScounters": "0,1,2,3,4,5,6,7",
Even if we eventually decide to extend 0x01a4 to fixed counter 3 and add
an entry FIXED_EVENT_CONSTRAINT(0x01a4, 3) to the constraints table, this
patch doesn't limit it.
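As a rough illustration of the two SLOTS cases, the sketch below models only the mask-extension step from the patched intel_pmu_check_event_constraints(). It is a simplified user-space model under stated assumptions, not the kernel code: it leaves out the intel_ctrl masking and the rest of the function, the sketch_* names are hypothetical, and SKETCH_IDX_FIXED assumes INTEL_PMC_IDX_FIXED is 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: index of the first fixed counter (INTEL_PMC_IDX_FIXED). */
#define SKETCH_IDX_FIXED	32
/* Fixed counter 3, i.e. INTEL_PMC_IDX_FIXED + 3 as in the patch context. */
#define SKETCH_MSK_FIXED_SLOTS	(1ULL << (SKETCH_IDX_FIXED + 3))

/* Same test as the patch helper: event-select byte of 0x00. */
static inline bool sketch_is_pseudo(uint64_t code)
{
	return !(code & 0xff);
}

/*
 * Model of the extension step: a fixed-counter constraint is widened
 * to all generic counters unless its code is a pseudo-encoding.
 * Assumes num_counters is less than 64.
 */
static inline uint64_t sketch_extend_mask(uint64_t code, uint64_t idxmsk64,
					  int num_counters)
{
	if (!sketch_is_pseudo(code))
		idxmsk64 |= (1ULL << num_counters) - 1;
	return idxmsk64;
}
```

Assuming 8 generic counters, sketch_extend_mask(0x0400, SKETCH_MSK_FIXED_SLOTS, 8) leaves the mask at fixed counter 3 only, while a hypothetical 0x01a4 entry on fixed counter 3 would come back with the low 8 GP bits set as well.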
Thanks,
Kan
/*
* We model BTS tracing as another fixed-mode PMC.
*
--
2.7.4