Now, the request to change the frequency starts from the cpufreq governors, like schedutil, when they call:

    __cpufreq_driver_target(policy, 599 MHz, CPUFREQ_RELATION_L);

CPUFREQ_RELATION_L means: lowest frequency at or above target. So I would expect the frequency to get set to 600 MHz (if we look at the clock driver) or 700 MHz (if we look at the OPP table). I think we should decide this from the OPP table only, as that's what the platform guys want us to use. So, we should end up with 700 MHz.

Then we land in dev_pm_opp_set_rate(), which does this (which is code copied from an earlier version of the cpufreq-dt driver):
So before we land in dev_pm_opp_set_rate() from __cpufreq_driver_target(), I guess there is a cpufreq driver callback that gets called in between, which is either .target_index or .target?

In the case of .target_index, the cpufreq core looks for an OPP index and we would end up with 700 MHz, I guess, so we are good.

In the case of .target though, the 'relation' CPUFREQ_RELATION_L does get passed on to the cpufreq driver, which I am guessing is expected to handle it in some way to make sure the frequency that gets set is not less than what's requested, instead of simply passing the requested frequency on to dev_pm_opp_set_rate()?

Looking at all the existing cpufreq drivers upstream, while most support .target_index, the 3 which do support .target seem to completely ignore this 'relation' input that's passed to them:

drivers/cpufreq/cppc_cpufreq.c:    .target = cppc_cpufreq_set_target,
drivers/cpufreq/cpufreq-nforce2.c: .target = nforce2_target,
drivers/cpufreq/pcc-cpufreq.c:     .target = pcc_cpufreq_target,
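To make the 599 MHz case concrete, here is a minimal userspace sketch (not kernel code) of what resolving CPUFREQ_RELATION_L against a table amounts to. The function name freq_ceil() and both tables are hypothetical; only the 600 MHz clock entry and the 700 MHz OPP entry come from the discussion above, the other entries are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical tables, in kHz, sorted in ascending order. Only the
 * 600 MHz (clock driver) and 700 MHz (OPP table) entries are taken
 * from the thread; everything else is an assumption.
 */
const unsigned long clk_table_khz[] = { 300000, 600000, 700000, 1000000 };
const unsigned long opp_table_khz[] = { 300000, 700000, 1000000 };

/*
 * RELATION_L semantics: return the lowest frequency at or above
 * target_khz, or 0 if the table has no such entry.
 */
unsigned long freq_ceil(const unsigned long *table, size_t n,
                        unsigned long target_khz)
{
    assert(table != NULL);

    for (size_t i = 0; i < n; i++) {
        if (table[i] >= target_khz)
            return table[i];
    }
    return 0;
}
```

With these tables, freq_ceil(clk_table_khz, 4, 599000) gives 600000 and freq_ceil(opp_table_khz, 3, 599000) gives 700000, which is the 600 MHz vs 700 MHz difference described above. A .target callback that ignores 'relation' would instead pass 599000 through unchanged.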
This kind of behavior (introduced by this patch) is important for other devices which want to run at the frequency nearest to the target one, but not for CPUs/GPUs. So, we need to tag these IO devices separately, maybe from DT? So for them we select the closest match instead of the most optimal one.
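The difference between the two policies can be sketched in plain userspace C as below. This is only an illustration under assumptions: struct hypo_dev, the wants_closest flag and all function names are hypothetical and do not correspond to any existing OPP or DT interface, and the table values are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical OPP frequencies in kHz, sorted in ascending order. */
const unsigned long hypo_freqs_khz[] = { 300000, 700000, 1000000 };

/* Hypothetical per-device tag; no such flag exists in the OPP core. */
struct hypo_dev {
    const unsigned long *freqs_khz;
    size_t n;
    bool wants_closest;   /* true for IO devices, false for CPUs/GPUs */
};

/* Lowest frequency at or above target, or 0 if there is none. */
unsigned long hypo_pick_ceil(const unsigned long *t, size_t n,
                             unsigned long target)
{
    for (size_t i = 0; i < n; i++) {
        if (t[i] >= target)
            return t[i];
    }
    return 0;
}

/* Frequency nearest to target; ties go to the lower frequency. */
unsigned long hypo_pick_closest(const unsigned long *t, size_t n,
                                unsigned long target)
{
    unsigned long best = 0, best_diff = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long diff = t[i] > target ? t[i] - target : target - t[i];

        if (i == 0 || diff < best_diff) {
            best = t[i];
            best_diff = diff;
        }
    }
    return best;
}

/* Choose the policy based on the hypothetical device tag. */
unsigned long hypo_pick_rate(const struct hypo_dev *dev, unsigned long target)
{
    assert(dev != NULL && dev->freqs_khz != NULL);

    if (dev->wants_closest)
        return hypo_pick_closest(dev->freqs_khz, dev->n, target);
    return hypo_pick_ceil(dev->freqs_khz, dev->n, target);
}
```

For a 400000 kHz request against this table, a device tagged wants_closest ends up at 300000 while an untagged (CPU/GPU-like) device ends up at 700000.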
Yes, we do need some way to distinguish between CPU/GPU devices and other IO devices. A CPU/GPU can always run at fmax for a given voltage; that's not true for IO devices, and I don't see how we can satisfy both cases without clearly knowing whether we are serving a processor or an IO device, unless the higher layers (cpufreq/devfreq) are able to handle this somehow without expecting the OPP layer to handle the differences.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation