[PATCH v3 3/3] [RFC] CPUFreq: Add support for cpu-perf-dependencies

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi All,

This is a continuation of the previous v2, where we focused mostly on the
dt binding.

I am seeking some feedback/comments on the following two approaches.

Intro:
We have seen that in a system where performance control and hardware
description do not match (i.e. per-cpu), we still need the information of
how the v/f lines are shared among the cpus. We call this information
"performance dependencies".
We got this info through the opp-shared (the previous 2 patches aim for
that).

Problem:
How do we share such info (retrieved from a cpufreq driver) to other
consumers that rely on it? I have two proposals.

First proposal:
Having a new cpumask 'dependent_cpus' to represent the
CPU performance dependencies seems a viable and scalable way.

The main reason for a new cpumaks is that although 'related_cpus'
could be (or could have been) used for such purpose, its meaning has
changed over time [1].

Provided that the cpufreq driver will populate dependent_cpus and set
shared_type accordingly, the s/w components that rely on such description
(we focus on energy-model and cpufreq_cooling for now) will always be
provided with the correct information, when picking the new cpumask.

Proposed changes (at high level)(3):

1) cpufreq: Add new dependent_cpus cpumaks in cpufreq_policy

   * New cpumask addition
   <snippet>
struct cpufreq_policy {
        cpumask_var_t           related_cpus; /* Online + Offline CPUs */
        cpumask_var_t           real_cpus; /* Related and present */

+       /*
+        * CPUs with hardware clk/perf dependencies
+        *
+        * For sw components that rely on h/w info of clk dependencies when hw
+        * coordinates. This cpumask should always reflect the hw dependencies.
+        */
+       cpumask_var_t           dependent_cpus;                 /* all clk-dependent cpus */
+
        unsigned int            shared_type; /* ACPI: ANY or ALL affected CPUs
   </snippet>

   * Fallback mechanism for dependent_cpus. With this, s/w components can
     always pick dependent_cpus regardless the coordination type.
   <snippet>
static int cpufreq_online(unsigned int cpu)

                /* related_cpus should at least include policy->cpus. */
                cpumask_copy(policy->related_cpus, policy->cpus);
+
+               /* dependent_cpus should differ only when hw coordination is in place */
+               if (policy->shared_type != CPUFREQ_SHARED_TYPE_HW)
+                       cpumask_copy(policy->dependent_cpus, policy->cpus);
        }
   </snippet>

   * Add sysfs attribute for dependent_cpus

2) drivers/thermal/cpufreq_cooling: Replace related_cpus with dependent_cpus

3) drivers/cpufreq/scmi-cpufreq: Get perf-dependencies from dev_pm_opp_of_get_sharing_cpus()

   * Call dev_pm_opp_of_get_sharing_cpus()

   * Populate policy->dependent_cpus if possible

   * Set policy->shared_type accordingly

   * Provide to EM the correct performance dependencies information
   <snippet>
-       em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);
+       /*
+       * EM needs accurate information about perf boundaries, thus provide the
+       * correct cpumask.
+       */
+       if (policy->shared_type == CPUFREQ_SHARED_TYPE_HW)
+               em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb,
+                                           policy->dependent_cpus);
+       else
+               em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb,
+                                           policy->cpus);
   </snippet>

Second proposal:

Another option could be for each driver to store internally the performance
dependencies and let the driver directly provide the correct cpumask for
any consumer.
Whilst this works fine for energy model (see above), it is not the case
(currently) for cpufreq_cooling, thus we would need to add a new interface for
it. That way the driver can call directly the registration function.
This seems to be possible since the CPUFreq core can skip to register
cpufreq_cooling (flag CPUFREQ_IS_COOLING_DEV).

In comparison with the first proposal this one is less scalable since each
driver will have to call the registration functions, while in some cases
the CPUFreq core could do it.

Any comments/preferences?

Many Thanks
Nicola

[1] https://lkml.org/lkml/2020/9/24/399
-- 
2.27.0




[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux