On Thu, Feb 6, 2025 at 10:15 PM Ian Rogers <irogers@xxxxxxxxxx> wrote: > > On Thu, Feb 6, 2025 at 8:44 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote: > > On Wed, Feb 05, 2025 at 11:44:57PM -0800, Ian Rogers wrote: > > > Why was adding a PMU to an event name, working around ARM's PMU bug, > > > such an unsurmontable problem that the original change was reverted? > > > Because 1 person didn't want to have to write a PMU prefix and > > > considered it a monumental regression having to do so. > > > > Because it's a legacy event 'cycles' and he didn't expect the wildcard > > behavior? > > And someone who say with perf v6.14 can type `perf stat -e data_read > ...` and then with your proposal now has to type `perf stat -e > uncore_imc_free_running/data_read/ ...` because data_read isn't a core > event, this is expected behavior because the error message mentions > perf list? > > > > > 1. RISC-V is working on a solution with the current status and it's not > > > > absoluted needed to change the current behavior. > > > > > > They said to you directly it was what they wanted, that's why I > > > reposted this change and it is, has always been, in the cover letter. > > > They've then followed up expressing their desire for this behavior but > > > having to have a plan b as the original change was reverted and you > > > are blocking this change landing. > > > > So they have the plan B. But still prefer overriding legacy with JSON? > > Yes. > Even though the driver encoding was envisioned as plan B, I think we have to keep that irrespective of legacy overriding with json is available or not due to the reasons I iterated earlier (e.g direct legacy event usage and hypervisor) and some renewed interest in standardizing event encodings in RISC-V[1] If the overriding legacy with JSON is available, each future vendor may just provide the json file instead of modifying the driver. However, it will be a matter of convenience and clutter free future rather than a necessity at this point. [1] https://lists.riscv.org/g/sig-perf-analysis/topic/110906276#msg458 > > > > 2. Apple-M is fixed already. > > > > > > No, James tried to repro the bug on a Juno board, not an Apple M, and > > > didn't succeed. I don't know what kernel he tried. I was told by Mark > > > Rutland (at LPC) that the tool fix was absolutely necessary and the > > > PMU driver wouldn't be fixed, hence the series flipping behavior that > > > I thought Intel would most likely block and wasn't keen to do in the > > > 1st place (not least wade through all the test behavior changes and > > > the bug tail). All of this was premised on a threat of reverting all > > > of the hybrid support so that Apple M could be made to work again, and > > > I was trying to do a less worse alternative. > > > https://lore.kernel.org/r/20231123042922.834425-1-irogers@xxxxxxxxxx > > > > Sorry, it's not clear to me what's the problem exactly. Can you give me > > an example command line? > > What broke: when arm PMUs were recognized as core and not uncore PMUs, > as part of fixing hybrid, we encoded legacy events on them. So > arm_blah/cycles/ became a type 0 config 0 event, no extended type as > PMU support for that is tested first. A type 0 config 0 event is > broken on the Apple-M PMUs, an event that doesn't count or something > like that. Because they had a sysfs event of arm_blah/cycles/ before > the change the broken legacy encoding on the PMU was never used, the > legacy event broke things. > > Because they had this problem the Apple-M users were used to using > arm_blah/cycles/ rather than cycles to avoid legacy events. This > change, not your proposal, is making it so that without a PMU they > also don't get legacy events because in no uncertain terms it was > expressed they weren't going to work. There was a lot of advocating > for removing all hybrid support from the tool. > > > > I don't understand what you are trying to say. I'm saying the behavior > > > of perf list in its output is arbitrary. We use the same printing code > > > for every kind of event. An aesthetic decision to put things on a line > > > does not imply that it is more valid to use or not use a PMU, it just > > > happens to be what the tool does. Did I break perf list as if you look > > > in old perf list you see: > > > ``` > > > $ perf list > > > List of pre-defined events (to be used in -e or -M): > > > > > > duration_time [Tool event] > > > ... > > > ``` > > > while now you see: > > > ``` > > > $ perf list > > > List of pre-defined events (to be used in -e or -M): > > > ... > > > tool: > > > duration_time > > > [Wall clock interval time in nanoseconds. Unit: tool] > > > ... > > > ``` > > > I'm hoping people find it useful to have the unit documented. > > > > The most important information I think is the name of the event > > (duration_time). It'd be appropriate if you could call it > > 'tool/duration_time/' but I'm not sure if it's acceptable cause > > tool events are not real PMU events. If so, maybe > > > > duration_time or tool/duration_time/ > > > > ? > > I don't mind showing a PMU and not showing a PMU. duration_time isn't > a core event, does it also get allowed no PMU prefix in your new > scheme? My point isn't to discuss duration_time it is to point out > that `perf list` output isn't sacred and says different things over > time. Those things may or may not include a PMU as there has never > been any rigor, it is a mush of strings that are printed. > > In the perf list code we have an event and an alias. In my opinion if > something is an alias of something else then it implies having the > same perf_event_attr encoding. In your proposal this wouldn't be true > for legacy events as it isn't true today. Which has always been my > point about wanting to get this fixed. > > > I think people should use a PMU prefix before wildcard is enabled. > > I don't understand. You want to break uncore events without a PMU and > disable wild carding, then enable wildcarding again. Like I say I > think it is better you work on this behavior under a non `-e` command > line option. > > > > > > What happens if an event is both in sysfs and json? Well the sysfs event > > > > > will get the description from the json and then I believe it won't > > > > > behave as you show. Did the event get broken, as perf list no longer > > > > > shows it with a PMU, by having a json description written? I think not > > > > > and I think having descriptions with events is a good thing. > > > > > > > > That's bad. Probably we should fix it takes only one of the sources and > > > > change the JSON event not to clash with sysfs. > > > > > > No, you are talking about breaking everything already, let's not break > > > it yet further - not least as we lack a reasonable way to test it. I > > > think if you are serious about having such breaking changes then it is > > > best you add a new command line option, like with libpfm events. > > > > I don't want to break things. What's the intended behavior in that case? > > The behavior is in pmu's update_event, but basically we prefer the > json data over the sysfs data: > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=perf-tools-next#n506 > This allows the json/tool data to correct the sysfs data - as well as > to add information like descriptions and topic. > But my point isn't that I support your let's have two events instead > of updating events. I have maintained this behavior as it has always > been the behavior and I care about not breaking everything. Something > that I assumed was taken for granted hence making `perf top` behave in > a way where it is showing samples for processes that have terminated > by default. > > Thanks, > Ian