Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq

Leonard Crestez <leonard.crestez@xxxxxxx> · Wed, 23 Oct 2019 02:20:10 +0000

On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
> <leonard.crestez@xxxxxxx> wrote:
>> I've been working on a series which add DEV_PM_QOS support to devfreq,
>> now at v9:
>>
>> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
>> depends upon. I found the email on patchwork, hopefully the in-reply-to
>> header is OK?
>>
>> As far as I can tell the replacement ("frequency qos") needs constraints
>> to be managed outside the device infrastructure and it's not obviously
>> usable a generic mechanism for making "min_freq/max_freq" requests to a
>> specific device.
> 
> You can add a struct freq_constrants pointer to struct dev_pm_info and
> use it just fine.  It doesn't have to be bolted into struct
> dev_pm_qos.

I'm not sure what you mean by this? min/max_freq was already available 
in dev_pm_qos so it's not clear why it would be moved somewhere else. 
What I'm looking for is a mechanism to make min/max_freq requests on a 
per-device basis and DEV_PM_QOS_MIN_FREQUENCY already did that.

Reuse is good, right?

>> I've read a bit through your emails and it seems the problem is that
>> you're dealing with dev_pm_qos on per-policy basis but each "struct
>> cpufreq_policy" can cover multiple CPU devices.
>>
>> An alternative solution which follows dev_pm_qos would be to add
>> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
>> makes quite a bit of sense because each CPU is a separate "device" with
>> a possibly distinct list of qos requests.
> 
> But combining the lists of requests for all the CPUs in a policy
> defeats the idea of automatic aggregation of requests which really is
> what PM QoS is about.

My primary interest is the "dev" part of dev_pm_qos: making pm_qos 
requests tied to a specific device.

> There have to be two lists of requests per policy, one for the max and
> one for the min frequency >
>> If cpufreq needs a group of CPUs to run at the same frequency then it
>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
>> device and picking a frequency that attempts to satisfy all constraints.
> 
> No, that would be combining the requests by hand.

It's just a loop though.

>> Handling sysfs min/max_freq through dev_pm_qos would be of dubious
>> value, though I guess you could register identical requests for each CPU.
>>
>> I'm not familiar with what you're trying to accomplish with PM_QOS other
>> than replace the sysfs min_freq/max_freq files:
> 
> QoS-based management of the frequency limits is not really needed for
> that.  The real motivation for adding it were things like thermal and
> platform firmware induced limits that all have their own values to
> combine with the ones provided by user space.

Current users seem to be thermal-related. Do you care about min/max_freq 
requests from stuff not directly tied to a CPU?

>> What I want to do is add
>> a driver using the interconnect driver which translates requests for
>> "bandwidth-on-a-path" into "frequency-on-a-device". More specifically a
>> display driver could request bandwidth to RAM and this would be
>> translated into min frequency for NoC and the DDR controller, both of
>> which implement scaling via devfreq:
>>
>> This is part of an effort to upstream an out-of-tree "busfreq" feature
>> which allows device device to make "min frequency requests" through an
>> entirely out-of-tree mechanism. It would also allow finer-grained
>> scaling that what IMX tree currently support.
>>
>> If you're making cpufreq qos constrains be "per-cpufreq-policy" then
>> it's not clear how you would handle in-kernel constraints from other
>> subsystems. Would users have to get a pointer to struct cpufreq_policy
>> and struct freq_constraints?
> 
> Yes.
> 
>> That would make object lifetime a nightmare!
> 
> Why really?  It is not much different from the device PM QoS case
 >> Actually,  is a simple
> one-for-one replacement of the former.  As it turns out, all of its
> users have access to a policy object anyway already.

All current users are very closely tied to cpufreq, what I had in mind 
is requests from unrelated subsystems.

Browsing through the cpufreq core it seems that it's possible for a 
struct cpufreq_policy to be created and destroyed at various points, the 
simplest example being rmmod/modprobe on a cpufreq driver.

The freq_qos_add_request function grabs a pointer to struct 
freq_constraints, this can become invalid when cpufreq_policy is freed.

I guess all users need to register a CPUFREQ_POLICY_NOTIFIER and make 
sure to freq_qos_add_request every time? Looking at your [PATCH 2/3] I 
can't spot any obvious issue, thermal clamping code seems to get the 
appropriate callbacks.
>> But dev_pm_qos solves this by tying to struct device.

The lifetime of "struct device" is already controlled by 
get_device/put_device.

> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
> need a per-policy min and max frequency in cpufreq, for governors etc.

Aggregation could be performed at two levels:

1) Per cpu device (by dev_pm_qos)
2) Per policy (inside cpufreq)

The per-cpu dev_pm_qos notifier would just update a per-policy 
pm_qos_constraints object. The second step could even be done strictly 
inside the cpufreq core using existing pm_qos, no need to invent new 
frameworks.

Maybe dev_pm_qos is not a very good fit for cpufreq because of these 
"cpu device versus cpufreq_policy" issues but it makes a ton of sense 
for devfreq. Can you maybe hold PATCH 3 from this series pending further 
discussion?

--
Regards,
Leonard