On 2025-01-21 11:59 a.m., Lucas De Marchi wrote: > On Tue, Jan 21, 2025 at 10:53:31AM -0500, Liang, Kan wrote: >> >> >> On 2025-01-21 9:29 a.m., Lucas De Marchi wrote: >>> On Mon, Jan 20, 2025 at 08:42:41PM -0500, Liang, Kan wrote: >>>>>>> -static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node >>>>>>> *node) >>>>>>> -{ >>>>>>> - struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), >>>>>>> cpuhp.node); >>>>>>> - unsigned int target = i915_pmu_target_cpu; >>>>>>> - >>>>>>> - /* >>>>>>> - * Unregistering an instance generates a CPU offline event >>>>>>> which >>>>>>> we must >>>>>>> - * ignore to avoid incorrectly modifying the shared >>>>>>> i915_pmu_cpumask. >>>>>>> - */ >>>>>>> - if (!pmu->registered) >>>>>>> - return 0; >>>>>>> - >>>>>>> - if (cpumask_test_and_clear_cpu(cpu, &i915_pmu_cpumask)) { >>>>>>> - target = cpumask_any_but(topology_sibling_cpumask(cpu), >>>>>>> cpu); >>>>>>> - >>>>>> >>>>>> I'm not familar with the i915 PMU, but it seems suggest a core scope >>>>>> PMU, not a system-wide scope. >>>>> >>>>> counter is in a complete separate device - it doesn't depend on >>>>> core or >>>>> die or pkg - not sure why it cared about topology_sibling_cpumask >>>>> here. >>>> >>>> OK. But it's still a behavior change. Please make it clear in the >>>> description that the patch also changes/fixes the scope from core scope >>>> to system-wide. >>> >>> sure... do you have a suggestion how to test the hotplug? For testing >>> purposes, can I force the perf cpu assigned to be something other than >>> the cpu0? >> >> Yes, it's a bit tricky to verify the hotplug if the assigned CPU is >> CPU0. I don't know a way to force another CPU without changing the code. >> You may have to instrument the code for the test. >> >> Another test you may want to do is the perf system-wide test, e.g., perf >> stat -a -e i915/actual-frequency/ sleep 1. >> >> The existing code assumes the counter is core scope. So the result >> should be huge, since perf will read the counter on each core and add >> them up. > > that is not allowed and it simply fails to init the counter: > > static int i915_pmu_event_init(struct perf_event *event) > ... > if (event->cpu < 0) > return -EINVAL; > if (!cpumask_test_cpu(event->cpu, &i915_pmu_cpumask)) > return -EINVAL; > ... > } > > event only succeeds the initialization in the assigned cpu. I see no > differences in results (using i915/interrupts/ since freq is harder to > compare): > > $ sudo perf stat -e i915/interrupts/ sleep 1 > > Performance counter stats for 'system wide': > > 253 i915/ > interrupts/ > 1.002215175 seconds time elapsed > > $ sudo perf stat -a -e i915/interrupts/ sleep 1 > > Performance counter stats for 'system wide': > > 251 i915/ > interrupts/ > 1.000900818 seconds time elapsed > > Note that our cpumask attr already returns just the assigned cpu and > perf-stat only tries to open on that cpu: > > $ strace --follow -s 1024 -e perf_event_open -- perf stat -a -e i915/ > interrupts/ sleep 1 > > [pid 55777] perf_event_open({type=0x24 /* PERF_TYPE_??? */, size=0x88 /* > PERF_ATTR_SIZE_??? */, config=0x100002, sample_period=0, > sample_type=PERF_SAMPLE_IDENTIFIER, > read_format=PERF_FORMAT_TOTAL_TIME_ENABLED| > PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, precise_ip=0 /* > arbitrary skid */, exclude_guest=1, ...}, -1, 0, -1, > PERF_FLAG_FD_CLOEXEC) = 3 > I see. The behavior is not changed with the patch. It should be just the topology_sibling_cpumask() which implies a misleading message. Thanks for the confirmation. > Lucas De Marchi > >> But this patch claims that the counter is system-wide. With the patch, >> the same perf command should only read the counter on the assigned CPU. >> >> Please also post the test results in the changelog. That's the reason >> why the scope has to be changed. > > it seems that migration code is simply wrong, not that we are changing > the scope here - it was already considered system-wide. I can add a > paragraph in the commit message explaining it. > Yes, please. Thanks, Kan