On 10/09/2024 10:59, tip-bot2 for Kan Liang wrote:
> The following commit has been merged into the perf/core branch of tip:
>
> Commit-ID:     4ba4f1afb6a9fed8ef896c2363076e36572f71da
> Gitweb:        https://git.kernel.org/tip/4ba4f1afb6a9fed8ef896c2363076e36572f71da
> Author:        Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> AuthorDate:    Fri, 02 Aug 2024 08:16:37 -07:00
> Committer:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> CommitterDate: Tue, 10 Sep 2024 11:44:12 +02:00
>
> perf: Generic hotplug support for a PMU with a scope
>
> The perf subsystem assumes that the counters of a PMU are per-CPU. So
> the user space tool reads a counter from each CPU in the system wide
> mode. However, many PMUs don't have a per-CPU counter. The counter is
> effective for a scope, e.g., a die or a socket. To address this, a
> cpumask is exposed by the kernel driver to restrict to one CPU to stand
> for a specific scope. In case the given CPU is removed,
> the hotplug support has to be implemented for each such driver.
>
> The codes to support the cpumask and hotplug are very similar.
> - Expose a cpumask into sysfs
> - Pickup another CPU in the same scope if the given CPU is removed.
> - Invoke the perf_pmu_migrate_context() to migrate to a new CPU.
> - In event init, always set the CPU in the cpumask to event->cpu
>
> Similar duplicated codes are implemented for each such PMU driver. It
> would be good to introduce a generic infrastructure to avoid such
> duplication.
>
> 5 popular scopes are implemented here, core, die, cluster, pkg, and
> the system-wide. The scope can be set when a PMU is registered. If so, a
> "cpumask" is automatically exposed for the PMU.
>
> The "cpumask" is from the perf_online_<scope>_mask, which is to track
> the active CPU for each scope. They are set when the first CPU of the
> scope is online via the generic perf hotplug support. When a
> corresponding CPU is removed, the perf_online_<scope>_mask is updated
> accordingly and the PMU will be moved to a new CPU from the same scope
> if possible.
>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Link: https://lore.kernel.org/r/20240802151643.1691631-2-kan.liang@xxxxxxxxxxxxxxx
> ---
>  include/linux/perf_event.h |  18 ++++-
>  kernel/events/core.c       | 164 +++++++++++++++++++++++++++++++++++-
>  2 files changed, 180 insertions(+), 2 deletions(-)
>
[...]
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 67e115d..5ff9735 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
[...]
> @@ -13856,6 +13980,42 @@ static void perf_event_exit_cpu_context(int cpu) { }
>
>  #endif
>
> +static void perf_event_setup_cpumask(unsigned int cpu)
> +{
> +     struct cpumask *pmu_cpumask;
> +     unsigned int scope;
> +
> +     cpumask_set_cpu(cpu, perf_online_mask);
> +
> +     /*
> +      * Early boot stage, the cpumask hasn't been set yet.
> +      * The perf_online_<domain>_masks includes the first CPU of each domain.
> +      * Always uncondifionally set the boot CPU for the perf_online_<domain>_masks.
                  ^^^^^^^^^^^^^^^ typo
> +      */
> +     if (!topology_sibling_cpumask(cpu)) {

This causes a compiler warning:

> kernel/events/core.c: In function 'perf_event_setup_cpumask':
> kernel/events/core.c:14012:13: error: the comparison will always evaluate as 'true' for the address of 'thread_sibling' will never be NULL [-Werror=address]
> 14012 |         if (!topology_sibling_cpumask(cpu)) {
>       |             ^
> In file included from ./include/linux/topology.h:30,
>                  from ./include/linux/gfp.h:8,
>                  from ./include/linux/xarray.h:16,
>                  from ./include/linux/list_lru.h:14,
>                  from ./include/linux/fs.h:13,
>                  from kernel/events/core.c:11:
> ./include/linux/arch_topology.h:78:19: note: 'thread_sibling' declared here
>    78 |         cpumask_t thread_sibling;
>       |                   ^~~~~~~~~~~~~~
> cc1: all warnings being treated as errors

Steve
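
For illustration, here is a minimal user-space sketch of why the check trips -Waddress, and one possible direction for it. It assumes the intent of the check is to detect a sibling mask that has not been populated yet; that is a guess, not something the patch states. All names below (mask_sketch, topo_sketch, sibling_mask, mask_empty, siblings_unset) are hypothetical stand-ins and not the kernel definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct mask_sketch { unsigned long bits; };
struct topo_sketch { struct mask_sketch thread_sibling; };

static struct topo_sketch topo[4];

/*
 * Expands to the address of a member embedded in an array element,
 * so the resulting pointer can never be NULL. Negating it with '!'
 * is a constant-false test, which is what -Waddress reports.
 */
#define sibling_mask(cpu) (&topo[cpu].thread_sibling)

/* True when no bit is set in the mask. */
static bool mask_empty(const struct mask_sketch *m)
{
	return m->bits == 0;
}

/*
 * True while the sibling mask for @cpu has not been filled in.
 * Tests the mask contents instead of the (never NULL) pointer.
 */
static bool siblings_unset(unsigned int cpu)
{
	return mask_empty(sibling_mask(cpu));
}
```

With a zero-initialised topo[] array, siblings_unset(0) is true; after setting topo[1].thread_sibling.bits to a non-zero value, siblings_unset(1) is false. Whether an emptiness test matches what the early-boot path needs is for the author to confirm.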