On Fri, Jan 10, 2025 at 11:40 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote: > > On Thu, Jan 09, 2025 at 02:21:09PM -0800, Ian Rogers wrote: > > Originally posted and merged from: > > https://lore.kernel.org/r/20240416061533.921723-10-irogers@xxxxxxxxxx > > This reverts commit 4f1b067359ac8364cdb7f9fda41085fa85789d0f although > > the patch is now smaller due to related fixes being applied in commit > > 22a4db3c3603 ("perf evsel: Add alternate_hw_config and use in > > evsel__match"). > > The original commit message was: > > > > It was requested that RISC-V be able to add events to the perf tool so > > the PMU driver didn't need to map legacy events to config encodings: > > https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@xxxxxxxxxxxx/ > > > > This change makes the priority of events specified without a PMU the > > same as those specified with a PMU, namely sysfs and JSON events are > > checked first before using the legacy encoding. > > I'm still not convinced why we need this change despite of these > troubles. If it's because RISC-V cannot define the lagacy hardware > events in the kernel driver, why not using a different name in JSON and > ask users to use the name specifically? Something like: > > $ perf record -e riscv-cycles ... So ARM and RISC-V are more than able to speak for themselves and have their tags on the series, but let's recap why I'm motivated to do this change: 1) perf supported legacy events; 2) perf supported sysfs and json events, but at a lower priority than legacy events; 3) hybrid support was added but in a way where all the hybrid PMUs needed to be known, assumptions about PMU were implicit and baked into the tool; 4) metric support for hybrid was going in a similar implicit direction and I objected, what would cycles mean in a metric if the core PMU was implicit? Rather than pursue this the hybrid code was overhauled, PMUs became more of a thing and we added a notion of a "core" PMU which would support legacy events; 5) ARM core PMUs differ in naming, etc. than just about every other platform. Their core events had been being programmed as if they were uncore events - ie without the legacy priority. Fixing hybrid, and fixing ARM PMUs to know they supported legacy events, broke perf on Apple-M? series due to a PMU driver issue with legacy events: https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@xxxxxxxxx/ "Perf broke on all Apple ARM64 systems (tested almost everything), and according to maz also on Juno (so, probably all big.LITTLE) since v6.5." 6) sysfs/json events were made the priority over legacy to unbreak perf on Apple-M? CPUs, but only if the PMU is specified: https://lore.kernel.org/r/20231123042922.834425-1-irogers@xxxxxxxxxx Reported-by: Hector Martin <marcan@xxxxxxxxx> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx> Tested-by: Hector Martin <marcan@xxxxxxxxx> Tested-by: Marc Zyngier <maz@xxxxxxxxxx> Acked-by: Mark Rutland <mark.rutland@xxxxxxx> This gets us to the current code where I can trivially get an inconsistency. Here on Intel with no PMU in the event name: ``` $ perf stat -vv -e cpu-cycles true Using CPUID GenuineIntel-6-8D-1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 752915 cpu -1 group_fd -1 flags 0x8 = 3 cpu-cycles: -1: 1293076 273429 273429 cpu-cycles: 1293076 273429 273429 Performance counter stats for 'true': 1,293,076 cpu-cycles 0.000809752 seconds time elapsed 0.000841000 seconds user 0.000000000 seconds sys ``` Here with a PMU event name: ``` $ sudo perf stat -vv -e cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 Attempt to add: cpu/cpu-cycles=0/ ..after resolving event: cpu/event=0x3c/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 4 (cpu) size 136 config 0x3c (cpu-cycles) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 752839 cpu -1 group_fd -1 flags 0x8 = 3 cpu/cpu-cycles/: -1: 1421235 531150 531150 cpu/cpu-cycles/: 1421235 531150 531150 Performance counter stats for 'true': 1,421,235 cpu/cpu-cycles/ 0.001292908 seconds time elapsed 0.001340000 seconds user 0.000000000 seconds sys ``` That is the no PMU event is opened as type=0/config=0 (legacy) while the PMU event is opened as type=4/config=0x3c (sysfs encoding). Now let's cross our fingers and hope that in the driver they are really the same thing. I take objection to the idea that there should be two different priorities for sysfs/json and legacy depending on whether a PMU is or isn't specified in the event name. The priority could be legacy then sysfs/json, or it could be sysfs/json then legacy, but it should be the same regardless of whether the PMU is put in the event name. The PMU in the event name should be optional, for example we may or may not show it in the stat output. The encoding being consistent was the case prior to the Apple-M? fix and this patch aims to make it consistent once more. Given the ARM bug mentioned above it should also fix specifying or not the PMU on Apple-M? CPUs as it will avoid the same legacy event issue that may only exist on old kernels. RISC-V is motivated because of not wanting hard coded legacy events in the driver for all potential vendors and models. Thanks, Ian