On Thu, Feb 6, 2025 at 8:44 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote: > On Wed, Feb 05, 2025 at 11:44:57PM -0800, Ian Rogers wrote: > > Why was adding a PMU to an event name, working around ARM's PMU bug, > > such an unsurmontable problem that the original change was reverted? > > Because 1 person didn't want to have to write a PMU prefix and > > considered it a monumental regression having to do so. > > Because it's a legacy event 'cycles' and he didn't expect the wildcard > behavior? And someone who say with perf v6.14 can type `perf stat -e data_read ...` and then with your proposal now has to type `perf stat -e uncore_imc_free_running/data_read/ ...` because data_read isn't a core event, this is expected behavior because the error message mentions perf list? > > > 1. RISC-V is working on a solution with the current status and it's not > > > absoluted needed to change the current behavior. > > > > They said to you directly it was what they wanted, that's why I > > reposted this change and it is, has always been, in the cover letter. > > They've then followed up expressing their desire for this behavior but > > having to have a plan b as the original change was reverted and you > > are blocking this change landing. > > So they have the plan B. But still prefer overriding legacy with JSON? Yes. > > > 2. Apple-M is fixed already. > > > > No, James tried to repro the bug on a Juno board, not an Apple M, and > > didn't succeed. I don't know what kernel he tried. I was told by Mark > > Rutland (at LPC) that the tool fix was absolutely necessary and the > > PMU driver wouldn't be fixed, hence the series flipping behavior that > > I thought Intel would most likely block and wasn't keen to do in the > > 1st place (not least wade through all the test behavior changes and > > the bug tail). All of this was premised on a threat of reverting all > > of the hybrid support so that Apple M could be made to work again, and > > I was trying to do a less worse alternative. > > https://lore.kernel.org/r/20231123042922.834425-1-irogers@xxxxxxxxxx > > Sorry, it's not clear to me what's the problem exactly. Can you give me > an example command line? What broke: when arm PMUs were recognized as core and not uncore PMUs, as part of fixing hybrid, we encoded legacy events on them. So arm_blah/cycles/ became a type 0 config 0 event, no extended type as PMU support for that is tested first. A type 0 config 0 event is broken on the Apple-M PMUs, an event that doesn't count or something like that. Because they had a sysfs event of arm_blah/cycles/ before the change the broken legacy encoding on the PMU was never used, the legacy event broke things. Because they had this problem the Apple-M users were used to using arm_blah/cycles/ rather than cycles to avoid legacy events. This change, not your proposal, is making it so that without a PMU they also don't get legacy events because in no uncertain terms it was expressed they weren't going to work. There was a lot of advocating for removing all hybrid support from the tool. > > I don't understand what you are trying to say. I'm saying the behavior > > of perf list in its output is arbitrary. We use the same printing code > > for every kind of event. An aesthetic decision to put things on a line > > does not imply that it is more valid to use or not use a PMU, it just > > happens to be what the tool does. Did I break perf list as if you look > > in old perf list you see: > > ``` > > $ perf list > > List of pre-defined events (to be used in -e or -M): > > > > duration_time [Tool event] > > ... > > ``` > > while now you see: > > ``` > > $ perf list > > List of pre-defined events (to be used in -e or -M): > > ... > > tool: > > duration_time > > [Wall clock interval time in nanoseconds. Unit: tool] > > ... > > ``` > > I'm hoping people find it useful to have the unit documented. > > The most important information I think is the name of the event > (duration_time). It'd be appropriate if you could call it > 'tool/duration_time/' but I'm not sure if it's acceptable cause > tool events are not real PMU events. If so, maybe > > duration_time or tool/duration_time/ > > ? I don't mind showing a PMU and not showing a PMU. duration_time isn't a core event, does it also get allowed no PMU prefix in your new scheme? My point isn't to discuss duration_time it is to point out that `perf list` output isn't sacred and says different things over time. Those things may or may not include a PMU as there has never been any rigor, it is a mush of strings that are printed. In the perf list code we have an event and an alias. In my opinion if something is an alias of something else then it implies having the same perf_event_attr encoding. In your proposal this wouldn't be true for legacy events as it isn't true today. Which has always been my point about wanting to get this fixed. > I think people should use a PMU prefix before wildcard is enabled. I don't understand. You want to break uncore events without a PMU and disable wild carding, then enable wildcarding again. Like I say I think it is better you work on this behavior under a non `-e` command line option. > > > > What happens if an event is both in sysfs and json? Well the sysfs event > > > > will get the description from the json and then I believe it won't > > > > behave as you show. Did the event get broken, as perf list no longer > > > > shows it with a PMU, by having a json description written? I think not > > > > and I think having descriptions with events is a good thing. > > > > > > That's bad. Probably we should fix it takes only one of the sources and > > > change the JSON event not to clash with sysfs. > > > > No, you are talking about breaking everything already, let's not break > > it yet further - not least as we lack a reasonable way to test it. I > > think if you are serious about having such breaking changes then it is > > best you add a new command line option, like with libpfm events. > > I don't want to break things. What's the intended behavior in that case? The behavior is in pmu's update_event, but basically we prefer the json data over the sysfs data: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=perf-tools-next#n506 This allows the json/tool data to correct the sysfs data - as well as to add information like descriptions and topic. But my point isn't that I support your let's have two events instead of updating events. I have maintained this behavior as it has always been the behavior and I care about not breaking everything. Something that I assumed was taken for granted hence making `perf top` behave in a way where it is showing samples for processes that have terminated by default. Thanks, Ian