On 12/08/2022 12:16, Pierre Gondois wrote:
> From: Pierre Gondois <Pierre.Gondois@xxxxxxx>

[...]

> find_energy_efficient_cpu() (feec) is now doing:
> feec()
> \_ for_each_pd(pd) [0]
>   // get max_spare_cap_cpu and compute_prev_delta
>   \_ for_each_cpu(pd) [1]
>
>   \_ get_pd_busy_time(pd) [2]
>     \_ for_each_cpu(pd)
>
>   // evaluate pd energy without the task
>   \_ get_pd_max_util(pd, -1) [3.0]
>     \_ for_each_cpu(pd)
>   \_ compute_energy(pd, -1)
>     \_ for_each_ps(pd)
>
>   // evaluate pd energy with the task on prev_cpu
>   \_ get_pd_max_util(pd, prev_cpu) [3.1]
>     \_ for_each_cpu(pd)
>   \_ compute_energy(pd, prev_cpu)
>     \_ for_each_ps(pd)
>
>   // evaluate pd energy with the task on max_spare_cap_cpu
>   \_ get_pd_max_util(pd, max_spare_cap_cpu) [3.2]
>     \_ for_each_cpu(pd)
>   \_ compute_energy(pd, max_spare_cap_cpu)
>     \_ for_each_ps(pd)
>
> [3.1] happens only once since prev_cpu is unique. To have an upper
> bound of the complexity, [3.1] is taken into account for all pds.
> So with the same definitions for nr_pd, nr_cpus and nr_ps,
> the complexity is of:
> nr_pd * (2 * [nr_cpus in pd] + 3 * ([nr_cpus in pd] + [nr_ps in pd]))
>  [0]  * (     [1] + [2]      +       [3.0] + [3.1] + [3.2]          )
> = 5 * nr_cpus + 3 * nr_ps
>
> The complexity limit was set to 2048 in:
> commit b68a4c0dba3b1 ("sched/topology: Disable EAS on inappropriate
> platforms")
> to make "EAS usable up to 16 CPUs with per-CPU DVFS and less than 8
> performance states each". For the same platform, the complexity would
> actually be of:
> 5 * 16 + 3 * 7 = 101

This is somewhat hard to grasp.
Example: 16 CPUs w/ per-CPU DVFS and < 8 performance states (OPPs) each

C   : Complexity
Nc  : #CPUs in system
Ns  : Sum of PSs (Performance States) over all PDs
Nd  : #PDs
Nc' : #CPUs in PD
Ns' : #PSs in PD

(1) Currently we have:

    C = Nd * (Nc + Ns)

    Nc = 16, Nd = 16, Ns = 16 * 7

    C = 16 * (16 + 16 * 7) = 2048

(2) Your new formula is:

    Nc' = 1, Ns' = 7

    C = Nd * (2 * Nc' + 3 * (Nc' + Ns'))
      = Nd * (5 * Nc' + 3 * Ns')
      = 16 * (5 * 1 + 3 * 7) = 416
      = 5 * Nc + 3 * Ns

I would update the example and leave C ~ at 2048.

> Since the EAS complexity was greatly reduced, bigger platforms can
> handle EAS. For instance, a platform with 256 CPUs with 256
> performance states each would reach it. To reflect this improvement,
> remove the EAS complexity check.
>
> Signed-off-by: Pierre Gondois <Pierre.Gondois@xxxxxxx>

We should definitely align feec()'s implementation with the EM
complexity check and documentation.

I would suggest that we keep both in place but we update them.

> ---
>  Documentation/scheduler/sched-energy.rst | 37 ++--------------------
>  kernel/sched/topology.c                  | 39 ++----------------------
>  2 files changed, 6 insertions(+), 70 deletions(-)
>
> diff --git a/Documentation/scheduler/sched-energy.rst b/Documentation/scheduler/sched-energy.rst
> index 8fbce5e767d9..3d1d71134d16 100644
> --- a/Documentation/scheduler/sched-energy.rst
> +++ b/Documentation/scheduler/sched-energy.rst
> @@ -356,38 +356,7 @@ placement. For EAS it doesn't matter whether the EM power values are expressed
>  in milli-Watts or in an 'abstract scale'.
>
>
> -6.3 - Energy Model complexity
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -The task wake-up path is very latency-sensitive. When the EM of a platform is
> -too complex (too many CPUs, too many performance domains, too many performance
> -states, ...), the cost of using it in the wake-up path can become prohibitive.
> -The energy-aware wake-up algorithm has a complexity of:
> -
> -    C = Nd * (Nc + Ns)
> -
> -with: Nd the number of performance domains; Nc the number of CPUs; and Ns the
> -total number of OPPs (ex: for two perf. domains with 4 OPPs each, Ns = 8).
> -
> -A complexity check is performed at the root domain level, when scheduling
> -domains are built. EAS will not start on a root domain if its C happens to be
> -higher than the completely arbitrary EM_MAX_COMPLEXITY threshold (2048 at the
> -time of writing).
> -
> -If you really want to use EAS but the complexity of your platform's Energy
> -Model is too high to be used with a single root domain, you're left with only
> -two possible options:
> -
> -  1. split your system into separate, smaller, root domains using exclusive
> -     cpusets and enable EAS locally on each of them. This option has the
> -     benefit to work out of the box but the drawback of preventing load
> -     balance between root domains, which can result in an unbalanced system
> -     overall;
> -  2. submit patches to reduce the complexity of the EAS wake-up algorithm,
> -     hence enabling it to cope with larger EMs in reasonable time.
> -
> -

I see value in this paragraph. Obviously it has to match the actual
feec() implementation.

[...]
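
For reference, the two complexity formulas compared in this reply can be
checked with a small C sketch. This is illustration only, not kernel code:
the function names and parameters are hypothetical, and the second helper
assumes all performance domains are identical (same Nc' and Ns' in each PD),
as in the 16-CPU per-CPU-DVFS example.

```c
/*
 * Illustrative helpers only; names and signatures are hypothetical and do
 * not exist in the kernel.
 */

/* Current check: C = Nd * (Nc + Ns), with Ns summed over all PDs. */
unsigned long em_complexity_current(unsigned long nr_pd,
				    unsigned long nr_cpus,
				    unsigned long nr_ps_total)
{
	return nr_pd * (nr_cpus + nr_ps_total);
}

/*
 * Proposed formula: C = Nd * (5 * Nc' + 3 * Ns').
 * Assumes every PD has the same number of CPUs and performance states.
 */
unsigned long em_complexity_proposed(unsigned long nr_pd,
				     unsigned long cpus_per_pd,
				     unsigned long ps_per_pd)
{
	return nr_pd * (5 * cpus_per_pd + 3 * ps_per_pd);
}
```

With Nd = 16, Nc = 16 and Ns = 16 * 7, em_complexity_current() gives 2048;
with Nd = 16, Nc' = 1 and Ns' = 7, em_complexity_proposed() gives 416.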