On Thu, Dec 28, 2023 at 03:57:05PM +0800, Tony W Wang-oc wrote:
> For Zhaoxin CPUs, the cores' highest frequencies may be different, which
> means that cores may run at different max frequencies.
>
> According to ACPI spec 6, chapter 8.4.7, the per-core highest frequency
> value can be obtained via CPPC.
>
> Cores with a higher highest frequency deliver better performance and can
> be called preferred cores. Better overall performance can be achieved by
> making the scheduler run tasks on these preferred cores.
>
> The cpufreq driver can use the highest frequency value as the priority
> of a core to make the scheduler favor it. More specifically, the
> acpi-cpufreq driver uses cppc_get_highest_perf() to get the highest
> frequency value of each core, uses sched_set_itmt_core_prio() to set
> that value as the core's priority, and uses sched_set_itmt_support()
> provided by ITMT to tell the scheduler to favor the preferred cores.
>
> Signed-off-by: Tony W Wang-oc <TonyWWang-oc@xxxxxxxxxxx>
> ---
>  drivers/cpufreq/acpi-cpufreq.c | 56 +++++++++++++++++++++++++++++++++-
>  1 file changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
> index 37f1cdf46d29..f4c1ff9e4bb0 100644
> --- a/drivers/cpufreq/acpi-cpufreq.c
> +++ b/drivers/cpufreq/acpi-cpufreq.c
> @@ -663,8 +663,56 @@ static u64 get_max_boost_ratio(unsigned int cpu)
>
>  	return div_u64(highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf);
>  }
> +
> +/* The work item is needed to avoid CPU hotplug locking issues */
> +static void sched_itmt_work_fn(struct work_struct *work)
> +{
> +	sched_set_itmt_support();
> +}
> +
> +static DECLARE_WORK(sched_itmt_work, sched_itmt_work_fn);
> +
> +static void set_itmt_prio(int cpu)
> +{
> +	static bool cppc_highest_perf_diff;
> +	static struct cpumask core_prior_mask;
> +	u64 highest_perf;
> +	static u64 max_highest_perf = 0, min_highest_perf = U64_MAX;
> +	int ret;
> +
> +	ret = cppc_get_highest_perf(cpu, &highest_perf);
> +	if (ret)
> +		return;
> +
> +	sched_set_itmt_core_prio(highest_perf, cpu);
> +	cpumask_set_cpu(cpu, &core_prior_mask);
> +
> +	if (max_highest_perf <= min_highest_perf) {
> +		if (highest_perf > max_highest_perf)
> +			max_highest_perf = highest_perf;
> +
> +		if (highest_perf < min_highest_perf)
> +			min_highest_perf = highest_perf;
> +
> +		if (max_highest_perf > min_highest_perf) {
> +			/*
> +			 * This code can be run during CPU online under the
> +			 * CPU hotplug locks, so sched_set_itmt_support()
> +			 * cannot be called from here. Queue up a work item
> +			 * to invoke it.
> +			 */
> +			cppc_highest_perf_diff = true;
> +		}
> +	}
> +
> +	if (cppc_highest_perf_diff && cpumask_equal(&core_prior_mask, cpu_online_mask)) {
> +		pr_debug("queue a work to set itmt enabled\n");
> +		schedule_work(&sched_itmt_work);
> +	}
> +}

sched_itmt_work and this function are a duplicate of what the
intel_pstate driver already does. It might be good to consolidate
them in a single place if you are going to pursue this approach.