Hi Valentin, On Wednesday 29 Jan 2020 at 17:13:53 (+0000), Valentin Schneider wrote: > Only commenting on the bits that should be there regardless of using the > workqueues or not; > > On 18/12/2019 18:26, Ionela Voinescu wrote: > > +static void cpu_amu_fie_init_workfn(struct work_struct *work) > > +{ > > + u64 core_cnt, const_cnt, ratio; > > + struct cpu_amu_work *amu_work; > > + int cpu = smp_processor_id(); > > + > > + if (!cpu_has_amu_feat()) { > > + pr_debug("CPU%d: counters are not supported.\n", cpu); > > + return; > > + } > > + > > + core_cnt = read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0); > > + const_cnt = read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0); > > + > > + if (unlikely(!core_cnt || !const_cnt)) { > > + pr_err("CPU%d: cycle counters are not enabled.\n", cpu); > > + return; > > + } > > + > > + amu_work = container_of(work, struct cpu_amu_work, cpu_work); > > + if (unlikely(!(amu_work->cpuinfo_max_freq))) { > > + pr_err("CPU%d: invalid maximum frequency.\n", cpu); > > + return; > > + } > > + > > + /* > > + * Pre-compute the fixed ratio between the frequency of the > > + * constant counter and the maximum frequency of the CPU (hz). > > I can't resist: s/hz/Hz/ > > > + */ > > + ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT); > > + ratio = div64_u64(ratio, amu_work->cpuinfo_max_freq * 1000); > > Nit: we're missing a comment somewhere that the unit of this is in kHz > (which explains the * 1000). > Will do! The previous comment that explained this was ".. while ensuring max_freq is converted to HZ.", but I believed it as too clear and replaced it with the obscure "(hz)". I'll revert :). > > + this_cpu_write(arch_max_freq_scale, (unsigned long)ratio); > > + > > Okay so what we get in the tick is: > > /\ core > -------- > /\ const > > And we want that to be SCHED_CAPACITY_SCALE when running at max freq. IOW we > want to turn > > max_freq > ---------- > const_freq > > into SCHED_CAPACITY_SCALE, so we can just multiply that by: > > const_freq > ---------- * SCHED_CAPACITY_SCALE > max_freq > > But what the ratio you are storing here is > > const_freq > arch_max_freq_scale = ---------- * SCHED_CAPACITY_SCALE² > max_freq > > (because x << 2 * SCHED_CAPACITY_SHIFT == x << 20) > > > In topology_freq_scale_tick() you end up doing > > /\ core arch_max_freq_scale > ------- * -------------------- > /\ const SCHED_CAPACITY_SCALE > > which gives us what we want (SCHED_CAPACITY_SCALE at max freq). > > > Now, the reason why we multiply our ratio by the square of > SCHED_CAPACITY_SCALE was not obvious to me, but you pointed me out that the > frequency of the arch timer can be *really* low compared to the max CPU freq. > > For instance on h960: > > [ 0.000000] arch_timer: cp15 timer(s) running at 1.92MHz (phys) > > $ root@valsch-h960:~# cat /sys/devices/system/cpu/cpufreq/policy4/cpuinfo_max_freq > 2362000 > > So our ratio would be > > 1'920'000 * 1024 > ---------------- > 2'362'000'000 > > Which is ~0.83, so that becomes simply 0... > > > I had a brief look at the Arm ARM, for the arch timer it says it is > "typically in the range 1-50MHz", but then also gives an example with 20KHz > in a low-power mode. > > If we take say 5GHz max CPU frequency, our lower bound for the arch timer > (with that SCHED_CAPACITY_SCALE² trick) is about ~4.768KHz. It's not *too* > far from that 20KHz, but I'm not sure we would actually be executing stuff > in that low-power mode. > > Long story short, we're probably fine, but it would nice to shove some of > the above into comments (especially the SCHED_CAPACITY_SCALE² trick) Okay, I'll add some of this documentation as comments in the patches. I thought about doing it but I was not sure it justified the line count. But if it saves people at least the hassle to unpack this computation to understand the logic, it will be worth it. Thank you for the thorough review, Ionela.