On 12/12/2023 20:27, Ian Rogers wrote: > On Tue, Dec 12, 2023 at 7:06 AM James Clark <james.clark@xxxxxxx> wrote: >> >> >> >> On 29/11/2023 06:02, Ian Rogers wrote: >>> Additional helpers to better replace >>> perf_cpu_map__has_any_cpu_or_is_empty. >>> >>> Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx> >>> --- >>> tools/lib/perf/cpumap.c | 27 +++++++++++++++++++++++++++ >>> tools/lib/perf/include/perf/cpumap.h | 16 ++++++++++++++++ >>> tools/lib/perf/libperf.map | 4 ++++ >>> 3 files changed, 47 insertions(+) >>> >>> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c >>> index 49fc98e16514..7403819da8fd 100644 >>> --- a/tools/lib/perf/cpumap.c >>> +++ b/tools/lib/perf/cpumap.c >>> @@ -316,6 +316,19 @@ bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map) >>> return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true; >>> } >>> >>> +bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map) >>> +{ >>> + if (!map) >>> + return true; >>> + >>> + return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 0).cpu == -1; >>> +} >> >> I'm struggling to understand the relevance of the difference between >> has_any and is_any I see that there is a slight difference, but could it >> not be refactored out so we only need one? > > Yep, that's what these changes are working toward. For has any the set > {-1, 0, 1} would return true while is any will return false. > Previously the has any behavior was called "empty" which I think is > actively misleading. > >> Do you ever get an 'any' map that has more than 1 entry? It's quite a >> subtle difference that is_any returns false if the first one is 'any' >> but then there are subsequent entries. Whereas has_any would return >> true. I'm not sure if future readers would be able to appreciate that. >> >> I see has_any is only used twice, both on evlist->all_cpus. Is there >> something about that member that means it could have a map that has an >> 'any' mixed with CPUs? Wouldn't that have the same result as a normal >> 'any' anyway? > > The dummy event may be opened on any CPU but then a particular event > may be opened on certain CPUs. We merge CPU maps in places like evlist > so that we can iterate the appropriate CPUs for events and > open/enable/disable/close all events on a certain CPU at the same time > (we also set the affinity to that CPU to avoid IPIs). What I'm hoping > to do in these changes is reduce the ambiguity, the corner cases are > by their nature unusual. > > An example of a corner case is, uncore events often get opened just on > CPU 0 but on a multi-socket system you may have a CPU 32 that also > needs to open the event. Previous code treated the CPU map index and > value it contained pretty interchangeably. This is often fine for the > core PMU but is clearly wrong in this uncore case, {0, 32} has indexes > 0 and 1 but those indexes don't match the CPU numbers. The case of -1 > has often previously been called dummy but I'm trying to call it the > "any CPU" case to match the perf_event_open man page (I'm hoping it > also makes it less ambiguous with any CPU being used with a particular > event like cycles, calling it dummy makes the event sound like it may > have sideband data). The difference between "all CPUs" and "any CPU" > is that an evsel for all CPUs would need the event opening > individually on each CPU, while any CPU events are a single open call. > Any CPU is only valid to perf_event_open if a PID is specified. > Depending on the set up there could be overlaps in what they count but > hopefully it is clearer what the distinction is. I believe the case of > "any CPU" and specific CPU numbers is more common with aux buffers and > Adrian has mentioned needing it for intel-pt. > > Thanks, > Ian > Thanks for explaining. I suppose I didn't realise that 'any' could be merged with per-cpu maps, but it makes sense. >>> + >>> +bool perf_cpu_map__is_empty(const struct perf_cpu_map *map) >>> +{ >>> + return map == NULL; >>> +} >>> + >>> int perf_cpu_map__idx(const struct perf_cpu_map *cpus, struct perf_cpu cpu) >>> { >>> int low, high; >>> @@ -372,6 +385,20 @@ bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map) >>> return map && __perf_cpu_map__cpu(map, 0).cpu == -1; >>> } >>> >>> +struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map) >>> +{ >>> + struct perf_cpu cpu, result = { >>> + .cpu = -1 >>> + }; >>> + int idx; >>> + >>> + perf_cpu_map__for_each_cpu_skip_any(cpu, idx, map) { >>> + result = cpu; >>> + break; >>> + } >>> + return result; >>> +} >>> + >>> struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map) >>> { >>> struct perf_cpu result = { >>> diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h >>> index dbe0a7352b64..523e4348fc96 100644 >>> --- a/tools/lib/perf/include/perf/cpumap.h >>> +++ b/tools/lib/perf/include/perf/cpumap.h >>> @@ -50,6 +50,22 @@ LIBPERF_API int perf_cpu_map__nr(const struct perf_cpu_map *cpus); >>> * perf_cpu_map__has_any_cpu_or_is_empty - is map either empty or has the "any CPU"/dummy value. >>> */ >>> LIBPERF_API bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map); >>> +/** >>> + * perf_cpu_map__is_any_cpu_or_is_empty - is map either empty or the "any CPU"/dummy value. >>> + */ >>> +LIBPERF_API bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map); >>> +/** >>> + * perf_cpu_map__is_empty - does the map contain no values and it doesn't >>> + * contain the special "any CPU"/dummy value. >>> + */ >>> +LIBPERF_API bool perf_cpu_map__is_empty(const struct perf_cpu_map *map); >>> +/** >>> + * perf_cpu_map__min - the minimum CPU value or -1 if empty or just the "any CPU"/dummy value. >>> + */ >>> +LIBPERF_API struct perf_cpu perf_cpu_map__min(const struct perf_cpu_map *map); >>> +/** >>> + * perf_cpu_map__max - the maximum CPU value or -1 if empty or just the "any CPU"/dummy value. >>> + */ >>> LIBPERF_API struct perf_cpu perf_cpu_map__max(const struct perf_cpu_map *map); >>> LIBPERF_API bool perf_cpu_map__has(const struct perf_cpu_map *map, struct perf_cpu cpu); >>> LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs, >>> diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map >>> index 10b3f3722642..2aa79b696032 100644 >>> --- a/tools/lib/perf/libperf.map >>> +++ b/tools/lib/perf/libperf.map >>> @@ -10,6 +10,10 @@ LIBPERF_0.0.1 { >>> perf_cpu_map__nr; >>> perf_cpu_map__cpu; >>> perf_cpu_map__has_any_cpu_or_is_empty; >>> + perf_cpu_map__is_any_cpu_or_is_empty; >>> + perf_cpu_map__is_empty; >>> + perf_cpu_map__has_any_cpu; >>> + perf_cpu_map__min; >>> perf_cpu_map__max; >>> perf_cpu_map__has; >>> perf_thread_map__new_array;