Hi Ian,

Sorry for the delay.

On Wed, Jan 15, 2025 at 09:56:59AM -0800, Ian Rogers wrote:
> On Wed, Jan 15, 2025 at 9:31 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > On Mon, Jan 13, 2025 at 03:04:26PM -0800, Ian Rogers wrote:
> > > On Mon, Jan 13, 2025 at 12:51 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Ian,
> > > >
> > > > On Fri, Jan 10, 2025 at 01:33:57PM -0800, Ian Rogers wrote:
> > > > > On Fri, Jan 10, 2025 at 11:26 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Fri, Jan 10, 2025 at 08:42:02AM -0800, Ian Rogers wrote:
[...]
> > > > > > > A patch lowering the priority of error messages should be independent
> > > > > > > of the 4 changes here. I'd be happy if someone follows this series
> > > > > > > with a patch doing it.
> > > > > >
> > > > > > I think the error behavior is a part of this change.
> > > > >
> > > > > I disagree with it, so I think you need to address my comments.
> > > >
> > > > You are changing the error behavior by skipping failed events then the
> > > > relevant error messages should be handled properly in this patchset.
> > >
> > > I'm not sure what you are asking and I'm not sure why it matters?
> > > Previously you'd asked for all the output to be moved under verbose.
> > >
> > > If I specify an event that doesn't work with perf record today then it
> > > fails. With this patch it fails too. If that event is a core PMU event
> > > then there will be an error message for each core PMU that doesn't
> > > support the event. So I get 2 error messages on hybrid. This doesn't
> > > feel egregious or warrant a new error message mechanism. I would like
> > > it so that evsels supported 1 or more PMUs, in which case this would
> > > be 1 error message.
> > >
> > > If I specify perf record today on an uncore event then perf record
> > > fails and I get 1 error message for the uncore PMU. The new behavior
> > > will be to get 1 error message per uncore PMU.
> > > If I'm on a server with
> > > 10s of uncore PMUs then maybe the message is spammy, but the command
> > > fails today and will continue to fail with this series. I don't see a
> > > motivation to change or optimize for this case and again, evsels that
> > > support >1 PMU would be the most appropriate fix.
> > >
> > > The only case where there is no message today but would be with this
> > > patch series is for cycles on ARM's neoverse. There will be one
> > > warning for the evsel on the SLC PMU. That's one warning and not many.
> > >
> > > As I've said, if you want a more elaborate error reporting system then
> > > take these patches and add it to them. There's a larger refactor to
> > > make evsels support >1 PMU that would clean up the many events on
> > > server uncore PMUs issue, but that shouldn't be part of this series
> > > nor gate it. If you are trying to perf record on uncore PMUs then you
> > > already have problems and optimizing the error messages for your
> > > mistake, I don't get why it matters?
> >
> > What about with multiple events in the command line - one of them
> > failing with >1 PMUs and the command now succeeds?
>
> So this would be something like:
> ```
> $ perf record -e cycles,instructions,data_read -a sleep 1
> ```
> where data_read is an uncore PMU event. The current behavior is:
> ```
> $ perf record -e cycles,instructions,data_read -a sleep 1
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
> ```
> The new behavior is:
> ```
> $ perf record -e cycles,instructions,data_read -a sleep 1
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_0'
> which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
>
> Error:
> Failure to open event 'data_read' on PMU 'uncore_imc_free_running_1'
> which will be removed.
> The sys_perf_event_open() syscall returned with 22 (Invalid argument)
> for event (data_read).
> "dmesg | grep -i perf" may provide additional information.
>
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 3.138 MB perf.data (11670 samples) ]
> ```
>
> We know nobody does this, as the command currently fails. It succeeds
> with this change, because that's the whole point of the change.

Well, I think it's because it failed before. New users can come anytime
and do whatever they want (or can). They might pass 100 failing events
with 1 successful event and it will give a ton of warnings with this.
So it'd be better to ratelimit the message and make it optional
(with -v).

But more importantly, I think we should agree on patch 4 first.

Thanks,
Namhyung

> I'm not offended by seeing the event was being opened on >1 PMU. For the
> only currently succeeding situation where this will now warn, the
> cycles case on Neoverse because of the buggy event name in ARM's SLC
> PMU, there will be 1 warning. For my example the appropriate fix is to
> remove the data_read event. For the Neoverse case, specifying the PMU
> resolves the issue until ARM fixes their driver.