On 24.10.2019 16:42, Rafael J. Wysocki wrote:
> On Wed, Oct 23, 2019 at 3:33 PM Leonard Crestez <leonard.crestez@xxxxxxx> wrote:
>>
>> On 2019-10-23 11:54 AM, Rafael J. Wysocki wrote:
>>> On Wed, Oct 23, 2019 at 4:20 AM Leonard Crestez <leonard.crestez@xxxxxxx> wrote:
>>>> On 2019-10-23 1:48 AM, Rafael J. Wysocki wrote:
>
> [cut]
>
>>>>> But combining the lists of requests for all the CPUs in a policy
>>>>> defeats the idea of automatic aggregation of requests which really is
>>>>> what PM QoS is about.
>>>>
>>>> My primary interest is the "dev" part of dev_pm_qos: making pm_qos
>>>> requests tied to a specific device.
>>>
>>> The list of requests needs to be associated with the user of the
>>> effective constraint. If that is the device, it is all good.
>>
>> The phrase "user of the effective constraint" is somewhat unclear.
>
> Fair enough, so let me elaborate.
>
> The effective constraint (ie. the one resulting from taking all of the
> requests in the relevant QoS list into account) affects the selection
> of an OPP, so it is natural to associate the QoS list producing it
> with a list of OPPs to select. In the cpufreq case, the policy holds
> the list of OPPs and so it also should hold the corresponding QoS
> lists (for the min and max frequency limits). It "uses" the effective
> constraints produced by those QoS lists by preventing OPPs outside the
> range between the min and max values from being selected.
>
> Essentially, the policy represents a power (clock/voltage) domain with
> multiple components (it doesn't matter what they are at this level of
> abstraction). While there can be multiple sources of QoS requests
> associated with each component, all of these requests ultimately need
> to be passed to the domain for aggregation, because that's where the
> frequency selection decisions are made and so that's where the
> effective constraint value needs to be known. Now, the natural way to
> allow requests from multiple sources to be passed for aggregation is
> to provide a QoS list that they can be added to. That really is what
> PM QoS is for.
>
>> I'm using the target device as dev for dev_pm_qos, not the requestor.
>> This is consistent with how it was used for cpufreq: thermal called
>> dev_pm_qos_add_request with dev = cpu_dev, not a thermal sensor or
>> anything else.
>
> Not really, but close. :-)
>
> Without my series (that is 5.4-rc4, say), the cpu_cooling driver adds
> its constraint to the device PM QoS of cpufreq_cdev, which is a special
> device created by that driver. That would be fine, except that the
> cpufreq core doesn't use that QoS. It uses the device PM QoS of the
> policy->cpu device instead. That is, that's where it adds its
> notifiers (see cpufreq_policy_alloc()), that's where user space
> requests are added (see cpufreq_online()), and (most important) that's
> where the effective constraint value is read from (see
> cpufreq_set_policy()). That turns out to be problematic (in addition
> to the cpu_cooling driver's QoS requests going nowhere), because
> confusion ensues if the current policy->cpu goes away.

That behavior in cpu_cooling seems like a bug.

>> However, looking at other dev_pm_qos users, there are instances of a
>> driver calling dev_pm_qos_add_request on its own device, but this is
>> not a strict requirement, correct?
>
> No, it isn't.
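OK, good. Just to be clear about the kind of usage I have in mind: a
driver placing a frequency cap on some other device's QoS, roughly like
this (untested sketch against the 5.4-rc device PM QoS frequency API,
i.e. before PATCH 3 of your series; the function names and the kHz
value are made up for illustration):

#include <linux/device.h>
#include <linux/pm_qos.h>

/* Hypothetical request object owned by the requesting driver. */
static struct dev_pm_qos_request freq_cap_req;

/* The request is attached to the *target* device, not the caller's own. */
static int cap_target_freq(struct device *target_dev, s32 max_khz)
{
        return dev_pm_qos_add_request(target_dev, &freq_cap_req,
                                      DEV_PM_QOS_MAX_FREQUENCY, max_khz);
}

static void drop_target_cap(void)
{
        /* The effective constraint is re-aggregated automatically. */
        dev_pm_qos_remove_request(&freq_cap_req);
}

The point is that the request list lives with the device whose
frequency is being constrained, regardless of who the requestor is.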
>
>>>>> There have to be two lists of requests per policy, one for the max
>>>>> and one for the min frequency
>
>>>>>> If cpufreq needs a group of CPUs to run at the same frequency then it
>>>>>> should deal with this by doing dev_pm_qos_read_frequency on each CPU
>>>>>> device and picking a frequency that attempts to satisfy all constraints.
>>>>>
>>>>> No, that would be combining the requests by hand.
>>>>
>>>> It's just a loop though.
>>>
>>> Yes, it is, and needs to be run on every change of an effective
>>> constraint for any CPU even if the total effective constraint doesn't
>>> change. And, of course, the per-policy user space limits would need
>>> to be combined with that by hand.
>>>
>>> Not particularly straightforward if you asked me.
>>
>> Well, this cpu-to-policy aggregation could also use a pm_qos_constraints
>> object instead of looping.
>
> Yes, it could, but then somebody would need to add those
> "intermediate" requests to a proper policy-level QoS and it would need
> an extra notifier invocation to update each of them on a "component"
> QoS change.
>
> This is an interesting idea in case we ever need to improve the
> scalability of the QoS lists, but I'd rather use the simpler approach
> for now.

The advantage I see is reducing the exposure of cpufreq internals.

>>>>> Well, the cpufreq sysfs is per-policy and not per-CPU and we really
>>>>> need a per-policy min and max frequency in cpufreq, for governors etc.
>>>>
>>>> Aggregation could be performed at two levels:
>>>>
>>>> 1) Per cpu device (by dev_pm_qos)
>>>> 2) Per policy (inside cpufreq)
>>>>
>>>> The per-cpu dev_pm_qos notifier would just update a per-policy
>>>> pm_qos_constraints object. The second step could even be done strictly
>>>> inside the cpufreq core using existing pm_qos, no need to invent new
>>>> frameworks.
>>>>
>>>> Maybe dev_pm_qos is not a very good fit for cpufreq because of these
>>>> "cpu device versus cpufreq_policy" issues but it makes a ton of sense
>>>> for devfreq. Can you maybe hold PATCH 3 from this series pending
>>>> further discussion?
>>>
>>> It can be reverted at any time if need be and in 5.4 that would be
>>> dead code.
>>
>> I guess I can post v10 of my "devfreq pm qos" series, which starts by
>> reverting "PATCH 3" of this series?
>
> You may do that, but I would consider adding a struct freq_constraints
> pointer directly to struct dev_pm_info and using the new frequency QoS
> helpers to manage it.
>
> Arguably, there is no need to bundle that with the rest of device PM
> QoS and doing the above would help to avoid some code duplication too.

Adding to struct dev_pm_info would increase sizeof(struct device), while
dev_pm_qos only allocates memory when constraints are added. My
expectation is that very few devices would even have min_freq and
max_freq constraints.

Maybe struct dev_pm_qos could host a "struct freq_constraints freq"
instead of two separate "struct pm_qos_constraints min/max_frequency"?
This way there would be two users of freq_constraints: cpufreq_policy
(which is not a device) and dev_pm_qos.

In the future freq_constraints might be extended to implement some
logic for conflicts between min_freq and max_freq requests.

--
Regards,
Leonard
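PS: To make the dev_pm_qos idea above a bit more concrete, the layout
I'm picturing is roughly this (sketch only, not compile-tested; field
and helper names are my guesses based on the new freq_qos code):

/* include/linux/pm_qos.h, hypothetical */
struct dev_pm_qos {
        struct pm_qos_constraints resume_latency;
        struct pm_qos_constraints latency_tolerance;
        /* One aggregated object instead of two pm_qos_constraints. */
        struct freq_constraints freq;
        struct pm_qos_flags flags;
        /* ... existing *_req pointers unchanged ... */
};

dev_pm_qos_add_request(dev, req, DEV_PM_QOS_MIN_FREQUENCY, value) would
then boil down to something like freq_qos_add_request(&qos->freq,
&req->data.freq, FREQ_QOS_MIN, value), assuming the request union grows
a struct freq_qos_request member, so cpufreq_policy and dev_pm_qos
would end up sharing the same aggregation code.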