Re: [RFT][PATCH 0/3] cpufreq / PM: QoS: Introduce frequency QoS and use it in cpufreq

"Rafael J. Wysocki" <rafael@xxxxxxxxxx> · Wed, 23 Oct 2019 10:54:31 +0200

On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@xxxxxxx> wrote:
>
> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
> > On Wed, Oct 23, 2019 at 12:06 AM Leonard Crestez
> > <leonard.crestez@xxxxxxx> wrote:
> >> I've been working on a series which add DEV_PM_QOS support to devfreq,
> >> now at v9:
> >>
> >> Your third patch removes DEV_PM_QOS_FREQUENCY_MIN/MAX that my series
> >> depends upon. I found the email on patchwork, hopefully the in-reply-to
> >> header is OK?
> >>
> >> As far as I can tell the replacement ("frequency qos") needs constraints
> >> to be managed outside the device infrastructure and it's not obviously
> >> usable a generic mechanism for making "min_freq/max_freq" requests to a
> >> specific device.
> >
> > You can add a struct freq_constrants pointer to struct dev_pm_info and
> > use it just fine.  It doesn't have to be bolted into struct
> > dev_pm_qos.
>
> I'm not sure what you mean by this? min/max_freq was already available
> in dev_pm_qos so it's not clear why it would be moved somewhere else.
> What I'm looking for is a mechanism to make min/max_freq requests on a
> per-device basis and DEV_PM_QOS_MIN_FREQUENCY already did that.
>
> Reuse is good, right?

But they go away in patch 3 of this series as there are no users in
the tree.  Sorry about that.

> >> I've read a bit through your emails and it seems the problem is that
> >> you're dealing with dev_pm_qos on per-policy basis but each "struct
> >> cpufreq_policy" can cover multiple CPU devices.
> >>
> >> An alternative solution which follows dev_pm_qos would be to add
> >> notifiers for each CPU inside cpufreq_online and cpufreq_offline. This
> >> makes quite a bit of sense because each CPU is a separate "device" with
> >> a possibly distinct list of qos requests.
> >
> > But combining the lists of requests for all the CPUs in a policy
> > defeats the idea of automatic aggregation of requests which really is
> > what PM QoS is about.
>
> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
> requests tied to a specific device.

The list of requests needs to be associated with the user of the
effective constraint.  If that is the device, it is all good.

> > There have to be two lists of requests per policy, one for the max and
> > one for the min frequency >
> >> If cpufreq needs a group of CPUs to run at the same frequency then it
> >> should deal with this by doing dev_pm_qos_read_frequency on each CPU
> >> device and picking a frequency that attempts to satisfy all constraints.
> >
> > No, that would be combining the requests by hand.
>
> It's just a loop though.

Yes, it is, and needs to be run on every change of an effective
constraint for any CPU even if the total effective constraint doesn't
change.  And, of course, the per-policy user space limits would need
to be combined with that by hand.

Not particularly straightforward if you asked me.

Not to mention the fact that, say, cpu_cooling, has a per-policy list
of requests anyway.

> >> Handling sysfs min/max_freq through dev_pm_qos would be of dubious
> >> value, though I guess you could register identical requests for each CPU.
> >>
> >> I'm not familiar with what you're trying to accomplish with PM_QOS other
> >> than replace the sysfs min_freq/max_freq files:
> >
> > QoS-based management of the frequency limits is not really needed for
> > that.  The real motivation for adding it were things like thermal and
> > platform firmware induced limits that all have their own values to
> > combine with the ones provided by user space.
>
> Current users seem to be thermal-related. Do you care about min/max_freq
> requests from stuff not directly tied to a CPU?

Yes, I do.

And they will need to add requests per policy.

> >> What I want to do is add
> >> a driver using the interconnect driver which translates requests for
> >> "bandwidth-on-a-path" into "frequency-on-a-device". More specifically a
> >> display driver could request bandwidth to RAM and this would be
> >> translated into min frequency for NoC and the DDR controller, both of
> >> which implement scaling via devfreq:
> >>
> >> This is part of an effort to upstream an out-of-tree "busfreq" feature
> >> which allows device device to make "min frequency requests" through an
> >> entirely out-of-tree mechanism. It would also allow finer-grained
> >> scaling that what IMX tree currently support.
> >>
> >> If you're making cpufreq qos constrains be "per-cpufreq-policy" then
> >> it's not clear how you would handle in-kernel constraints from other
> >> subsystems. Would users have to get a pointer to struct cpufreq_policy
> >> and struct freq_constraints?
> >
> > Yes.
> >
> >> That would make object lifetime a nightmare!
> >
> > Why really?  It is not much different from the device PM QoS case
>  >> Actually,  is a simple
> > one-for-one replacement of the former.  As it turns out, all of its
> > users have access to a policy object anyway already.
>
> All current users are very closely tied to cpufreq, what I had in mind
> is requests from unrelated subsystems.

You can use cpufreq policy notifiers for that.  Add a request for each
CPU in the policy (or for each related CPU if that is needed) to
policy->constraints on CREATE_POLICY and remove them on REMOVE_POLICY.
That's all you need to do.

BTW, the original code from Viresh did that already, I haven't changed
it.  And it didn't have per-CPU lists of frequency requests for that
matter, it used the ones in policy->cpu as the per-policy lists, which
doesn't work.

> Browsing through the cpufreq core it seems that it's possible for a
> struct cpufreq_policy to be created and destroyed at various points, the
> simplest example being rmmod/modprobe on a cpufreq driver.
>
> The freq_qos_add_request function grabs a pointer to struct
> freq_constraints, this can become invalid when cpufreq_policy is freed.
>
> I guess all users need to register a CPUFREQ_POLICY_NOTIFIER and make
> sure to freq_qos_add_request every time?

Yes.

The policy is the user of the effective constraint anyway and holding
on to a list of requests without a user of the effective constraint
would be, well, not useful.

> Looking at your [PATCH 2/3] I  can't spot any obvious issue, thermal clamping
> code seems to get the appropriate callbacks.
>
> >> But dev_pm_qos solves this by tying to struct device.
>
> The lifetime of "struct device" is already controlled by
> get_device/put_device.

And why does this matter here?

> > Well, the cpufreq sysfs is per-policy and not per-CPU and we really
> > need a per-policy min and max frequency in cpufreq, for governors etc.
>
> Aggregation could be performed at two levels:
>
> 1) Per cpu device (by dev_pm_qos)
> 2) Per policy (inside cpufreq)
>
> The per-cpu dev_pm_qos notifier would just update a per-policy
> pm_qos_constraints object. The second step could even be done strictly
> inside the cpufreq core using existing pm_qos, no need to invent new
> frameworks.
>
> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
> for devfreq. Can you maybe hold PATCH 3 from this series pending further
> discussion?

It can be reverted at any time if need be and in 5.4 that would be dead code.