Hello.

On Sat, Jul 30, 2011 at 10:02 AM, Turquette, Mike <mturquette@xxxxxx> wrote:
> On Fri, Jul 29, 2011 at 2:10 AM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
>> On Friday, July 29, 2011, Turquette, Mike wrote:
>>> On Thu, Jul 28, 2011 at 3:10 PM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
>>> > On Friday, July 15, 2011, MyungJoo Ham wrote:
>>> >> For a usage example, please look at
>>> >> http://git.infradead.org/users/kmpark/linux-2.6-samsung/shortlog/refs/heads/devfreq
>>> >>
>>> >> In the above git tree, DVFS (dynamic voltage and frequency scaling) mechanism
>>> >> is applied to the memory bus of Exynos4210 for Exynos4210-NURI boards.
>>> >> In the example, the LPDDR2 DRAM frequency changes between 133, 266, and 400MHz
>>> >> and other related clocks simply follow the determined DDR RAM clock.
>>> >>
>>> >> The DEVFREQ driver for Exynos4210 memory bus is at
>>> >> /arch/arm/mach-exynos4/devfreq_bus.c in the git tree.
>>> >>
>>> >> MyungJoo Ham (3):
>>> >>   PM: Introduce DEVFREQ: generic DVFS framework with device-specific
>>> >>     OPPs
>>> >>   PM / DEVFREQ: add example governors
>>> >>   PM / DEVFREQ: add sysfs interface (including user tickling)
>>> >
>>> > OK, I'm going to take the patches for 3.2.
>>>
>>> Have any other platforms signed up to use this mechanism to manage
>>> their peripheral DVFS?
>>
>> Not that I know of, but one initial user is sufficient for me.
>> So if you have anything _against_ the patches, please speak up.
>
> I do have some concerns. Let me start by saying that I'm defining a
> "governor" as some active piece of executing code, probably a looping
> workqueue that inspects activity/idleness of a device and then makes a
> determination regarding clock frequency.
>
> devfreq seems to be good framework for creating DVFS governors.
> However I think that most scalable devices on an SoC do *not* need a
> governor, and many scalable devices won't have performance counters or
> any other way to implement such introspection.

Yes. Except for the static or userspace-driven ones (such as
"performance", "powersave", and "userspace", although "userspace" is not
implemented for devfreq yet), governors run a looping workqueue that
inspects the activity/idleness of a device and determines its frequency.
However, the inspection is done through a callback provided by each
device, not directly by devfreq itself. Therefore, if there is any way
to measure activity (not just performance counters; the number of
requests or function calls should be fine in many cases), normal
governors like "simple-ondemand" will work.

> Some examples include a MMC controller, which might change its clock
> rate depending on the class of card that the user has inserted. Or
> even a "smartish" device like a GPU lacking performance counters; it's
> driver will ramp up frequency when there is work to be done and kick
> off a timeout. If no new work comes in before the timeout then the
> driver will drop the frequency.

In the "simple MMC controller w/o performance counters" case, there are
the following ways to use devfreq even if counting requests or function
calls is not possible.

Method 1) Use the "userspace" governor and let a user process choose the
frequency based on the card class.

Method 2) Use any "reasonable" governor and let the device driver keep
only the "valid" frequencies enabled. As a rough example: if class < 6,
disable freq > 40MHz; if class < 10, disable freq > 80MHz; and so on.

If we do not have performance counters or any other mechanism to monitor
activity, the "performance" governor along with a clock-gated MMC driver
will save enough power.

For GPUs without anything to monitor activity, we may do the same as in
the MMC case. However, with the H/W I have now (Exynos4210), we have
performance counters (PPMU) for many blocks: 3D (MALI GPU), ACP, CAMIF,
CPU, DMC0, DMC1 (memory controllers), FSYS, IMAGE, LCD0, LCD1, MFC_L,
MFC_R, TV, LEFT_BUS, and RIGHT_BUS.
I don't think Exynos4 is an exceptionally fancy SoC (millions have
already been sold in phones), and other mobile SoCs (at least the
flagship models) will have such counters very soon, or already have
them. In the example at the git branch link given with this patch set,
we control the DMC0/DMC1 blocks.

> A governor is not required in these cases (as they are event driven)
> and devfreq is quite heavyweight for such applications. What is
> needed is a QoS-style software layer that allows throughput requests
> to be made from an initiator device towards a target device. This
> layer should aggregate requests since many initator devices may make
> requests to the same target device. This layer I'm describing, which
> does not exist today, should be where the actual DVFS transition takes
> place. That could take the form of a clk_set_rate call in the clock
> framework (as described by Colin in V1 of this series), or some other
> not-yet-realized dvfs_set_opp ,or something like Jean Pihet's
> per-device PM QoS patches or whatever. For the purposes of this email
> I don't really care which framework implements the QoS request
> aggregation.

Such aggregation could also be done with governors. devfreq does not
loop unless at least one governor-device pair wants to be polled. If a
device is event-driven, users may just "allow/disallow" frequencies with
the OPP framework, and devfreq will choose a proper frequency with the
given governor for the device. If every device uses "static" or
"event-driven" governors such as powersave/performance/userspace, there
will be no polling/looping. When a device is going to be directly
controlled by userspace, we'll need a "userspace" governor (the same as
the userspace governor of cpufreq).

If there is a QoS request for a devfreq-ed device, the request could be
made with OPP's frequency enable/disable. If a device is to run at
400MHz or faster, all frequencies under 400MHz could simply be disabled
w/ OPP.
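
To illustrate the OPP enable/disable idea, here is a rough user-space
model in Python. It is not kernel code and not the devfreq or OPP API:
OppTable, set_floor, powersave, and performance are names made up for
this sketch. The three frequencies are the DRAM frequencies from the
Exynos4210 example.

```python
from dataclasses import dataclass, field


@dataclass
class OppTable:
    """Toy stand-in for a device's OPP list: frequency (kHz) -> enabled flag."""
    enabled: dict = field(default_factory=dict)

    def add(self, freq_khz):
        self.enabled[freq_khz] = True

    def set_floor(self, min_khz):
        """Express a QoS request by disabling every OPP below min_khz."""
        for freq in self.enabled:
            self.enabled[freq] = freq >= min_khz

    def available(self):
        """Enabled frequencies in ascending order."""
        return sorted(f for f, on in self.enabled.items() if on)


def powersave(opps):
    """Static governor: lowest frequency that is still enabled."""
    return opps.available()[0]


def performance(opps):
    """Static governor: highest frequency that is still enabled."""
    return opps.available()[-1]


bus = OppTable()
for khz in (133000, 266000, 400000):
    bus.add(khz)

before = powersave(bus)      # no QoS request yet
bus.set_floor(400000)        # "run at 400MHz or faster"
after = powersave(bus)       # the governor can only pick enabled OPPs
print(before, after)
```

The governor never sees the disabled frequencies, so it cannot go below
the requested floor. The sketch assumes at least one OPP stays enabled;
a floor above the highest frequency would leave nothing to choose from.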
Devfreq governors cannot override such frequency enable/disable
configurations. However, if such QoS requests need delays (timers) like
tickle, a generalized tickle that takes a frequency or a percentage of
the max frequency might work (e.g., tickle(dev, frequency, duration);).
This generalized tickle would then hold the device at the requested
frequency or higher by temporarily disabling lower frequencies.

>
> The point of describing this non-existant API is that devfreq should
> really be just another input into it. A governor that can measure bus
> saturation is really cool, but it may not yield optimal results
> compared to several drivers which make QoS-style requests and insure
> that performance is guaranteed for their particular needs during their
> transactions. The good news is that we don't have to choose between
> performance counter introspection and software QoS requests: both the
> driver requests and the governor should all feed as inputs into the
> QoS-style DVFS mechanism.
>
> Taking that logic to its inevitable conclusion, tickle doesn't belong
> inside the governor at all. If some device X wants to ramp up the
> frequency of device Y, it should just make a QoS-style throughput
> request towards device Y, possibly with a timeout (keeping the
> original idea of tickle intact). This is entirely a separate idea
> from a governor's introspective workqueue loop.

Although tickle shares the same loop with governors, tickle does not
belong inside governors. Tickle overrides the decisions of governors; a
governor's decision function is not called while the device is being
tickled. However, generalizing the tickle function so that it may take
"at least xx % of max frequency" or "operate at at least xx kHz" as an
option seems reasonable for QoS requests. Such options might be
implemented in a later version of devfreq. This requires modifying the
tickle function interface or adding another interface for tickle.
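
A minimal sketch of how such a generalized tickle could behave, again as
a user-space Python model rather than the interface in the patches. The
Device class, tickle_percent, and the explicit `now` argument (used
instead of a real timer) are assumptions made for illustration only.

```python
class Device:
    """Toy model of a devfreq-managed device; all names are hypothetical."""

    def __init__(self, freqs_khz, governor):
        self.freqs = sorted(freqs_khz)
        self.governor = governor   # callable: list of allowed freqs -> freq
        self.floor_khz = 0         # temporary lower bound set by tickle
        self.floor_until = 0.0     # time at which the bound expires

    def tickle(self, freq_khz, duration, now):
        """Hold the device at freq_khz or higher until now + duration."""
        if self.floor_until <= now:
            self.floor_khz = 0     # the previous tickle has expired
        # Overlapping tickles are merged conservatively:
        # highest floor, latest expiry.
        self.floor_khz = max(self.floor_khz, freq_khz)
        self.floor_until = max(self.floor_until, now + duration)

    def tickle_percent(self, percent, duration, now):
        """Same request expressed as a percentage of the max frequency."""
        self.tickle(self.freqs[-1] * percent // 100, duration, now)

    def target(self, now):
        """Frequency the device should run at, as seen at time `now`."""
        floor = self.floor_khz if now < self.floor_until else 0
        # Lower frequencies are "disabled" only while the tickle is active.
        allowed = [f for f in self.freqs if f >= floor]
        return self.governor(allowed or self.freqs[-1:])


# A powersave-like governor: always the lowest allowed frequency.
dmc = Device([133000, 266000, 400000], governor=min)

idle = dmc.target(now=0.0)
dmc.tickle(266000, duration=0.5, now=0.0)
held = dmc.target(now=0.1)       # inside the tickle window
released = dmc.target(now=0.6)   # window has expired
print(idle, held, released)
```

Here the governor still runs during the tickle but only sees the
frequencies at or above the requested one, which is the "disable lower
frequencies temporarily" variant rather than the current tickle that
bypasses the governor entirely.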
However, if such QoS requests do not need a duration, we can just go
with OPP's frequency enable/disable and disable the frequencies below
the QoS requirement. Thus, I guess this QoS issue is not very
significant for devfreq, and it can easily be mitigated by adding
another interface or modifying the interface of the tickle function.

>
> For userspace, a sysfs entry for tickle would also not feed into the
> governor, but some dummy struct device *user would probably be the
> initiator device and it would simply call the QoS-style throughput
> API.
>
> In summary my objections to this series are:
> 1) devfreq should not be the *final* software layer to invoke a DVFS
> transition as it has not taken all constraints into account.
> 2) a devfreq governor represents just one constraint out of many to be
> considered for any given scalable device.

If the concern is about QoS requests, I guess generalizing tickle as
described above would be sufficient. For devices without performance
counters or any other mechanism to infer usage statistics, the
"performance" governor with event-driven OPP freq-enable/disable should
be fine.

>
> My objection to these patches getting merged is that I think they are
> a bit ahead of their time. We need to know what the real DVFS API
> looks like underneath devfreq first, since devfreq should really be
> built on top of it.
>
> Regards,
> Mike
>
>> Thanks,
>> Rafael
>>
> _______________________________________________
> linux-pm mailing list
> linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>

Cheers!
MyungJoo.

--
MyungJoo Ham (함명주), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858