Re: [PATCH v3] PM / QoS: Introduce new classes: DMA-Throughput and DVFS-Latency

MyungJoo Ham <myungjoo.ham@xxxxxxxxxxx> · Wed, 7 Mar 2012 18:36:00 +0900

2012/3/7 Rafael J. Wysocki <rjw@xxxxxxx>:
> Hi,
>
> Can you please post all of your outstanding PM-related patches that
> you want me to look at in one series, so that they appear in context?
>
> I'm struggling to understand what you need all those changes for.
>

Hello Rafael,

I've put the patches at the following locations:

- devfreq patches (based on your pm-devfreq branch)
http://git.infradead.org/users/kmpark/linux-samsung/shortlog/refs/heads/devfreq-for-next

- pm-qos patches (based on your pm-qos branch)
http://git.infradead.org/users/kmpark/linux-samsung/shortlog/refs/heads/pm_qos-for-next

- For reference only, all related patches combined (please do not pull
from here):
http://git.infradead.org/users/kmpark/linux-samsung/shortlog/refs/heads/devfreq
This branch combines the devfreq, pm-qos, and cpufreq patches (based on
Linux 3.3-rc6 + some test patches). However, it does not include the
recent PM-QoS patches in the pm-qos branch, so its patches are older than
and differ from those above. So please do not pull from here to your
branches.

Anyway, we are still syncing our local repositories with infradead.org,
so there may be some lag; infradead.org is not showing the current
version yet.
(It even seems that the server is down right now. I'll retry the sync
once it comes back up.)

Thank you.

Cheers!
MyungJoo.

> Thanks,
> Rafael
>
>
> On Wednesday, March 07, 2012, MyungJoo Ham wrote:
>> 1. CPU_DMA_THROUGHPUT
>>
>> This might look similar to CPU_DMA_LATENCY. However, there are H/W
>> blocks that create QoS requirements based on DMA throughput, not
>> latency, and whose services are short-term bursts that DVFS mechanisms
>> (CPUFreq and Devfreq) cannot respond to effectively.
>>
>> In the Exynos4412 systems that are being tested, such H/W blocks include
>> the decoding and encoding features of the MFC (multi-format codec),
>> TV-out (including HDMI), and cameras. When the display is operated at
>> 60Hz, each chunk of work must be done within 16ms. The workload on DMA
>> is not well spread: it fluctuates between frames (some frames require
>> more and some less) and also fluctuates heavily within a frame. The
>> tasks within a frame are usually not parallelized; they are processed
>> through specific H/W blocks, not CPU cores. They often have PPMU
>> capabilities; however, they need to be polled very frequently (at
>> intervals shorter than 5ms) in order to let DVFS mechanisms react
>> properly.
>>
>> For such specific tasks, allowing them to request QoS requirements seems
>> adequate because DVFS mechanisms (as long as the polling rate is 5ms or
>> longer) cannot keep up with them. Besides, the device drivers know
>> exactly when to request and cancel QoS.
>>
>> 2. DVFS_LATENCY
>>
>> Both CPUFreq and Devfreq have a response latency to a sudden workload
>> increase. With a near-100% (e.g., 95%) up-threshold, the average
>> response latency is approximately 1.5 x polling-rate.
>>
>> A specific polling rate (e.g., 100ms) may generally suit a system;
>> however, there can be exceptions. For example,
>> - When user input suddenly starts (typing, clicking, moving cursors, and
>>   such), the user might need the full performance immediately. However,
>>   we do not know whether the full performance is actually needed
>>   until we calculate the utilization; thus, we need to calculate it
>>   sooner on user input or any similar event. Specifying QoS on CPU
>>   processing power or memory bandwidth at every user input is
>>   overkill because there are many cases where such a speed-up isn't
>>   necessary.
>> - When a device driver needs a faster performance response from the DVFS
>>   mechanism. This could be addressed by simply issuing QoS requests.
>>   However, such QoS requests may keep the system running fast
>>   unnecessarily in some cases, especially if a) the device's resource
>>   usage comes in bursts of some duration (e.g., 100ms-long bursts) and
>>   b) the driver doesn't know when such bursts will come. MMC/WiFi often
>>   show such behavior, although part (b) might possibly be addressed
>>   with further effort.
>>
>> The cases shown above can be tackled by putting QoS requests on the
>> response time or latency of the DVFS mechanism, which is directly
>> related to its polling interval (if the DVFS mechanism is polling-based).
>>
>> Signed-off-by: MyungJoo Ham <myungjoo.ham@xxxxxxxxxxx>
>> Signed-off-by: Kyungmin Park <kyungmin.park@xxxxxxxxxxx>
>>
>> --
>> Changes from v2
>> - Rebased on the recent PM QoS patches, resolving the merge conflict.
>>
>> Changes from RFC(v1)
>> - Added omitted part (registering new classes)
>> ---
>>  include/linux/pm_qos.h |    4 ++++
>>  kernel/power/qos.c     |   31 ++++++++++++++++++++++++++++++-
>>  2 files changed, 34 insertions(+), 1 deletions(-)
>>
>> diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
>> index c8a541e..0ee7caa 100644
>> --- a/include/linux/pm_qos.h
>> +++ b/include/linux/pm_qos.h
>> @@ -14,6 +14,8 @@ enum {
>>       PM_QOS_CPU_DMA_LATENCY,
>>       PM_QOS_NETWORK_LATENCY,
>>       PM_QOS_NETWORK_THROUGHPUT,
>> +     PM_QOS_CPU_DMA_THROUGHPUT,
>> +     PM_QOS_DVFS_RESPONSE_LATENCY,
>>
>>       /* insert new class ID */
>>       PM_QOS_NUM_CLASSES,
>> @@ -24,6 +26,8 @@ enum {
>>  #define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE     (2000 * USEC_PER_SEC)
>>  #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE     (2000 * USEC_PER_SEC)
>>  #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE      0
>> +#define PM_QOS_CPU_DMA_THROUGHPUT_DEFAULT_VALUE      0
>> +#define PM_QOS_DVFS_LAT_DEFAULT_VALUE        (2000 * USEC_PER_SEC)
>>  #define PM_QOS_DEV_LAT_DEFAULT_VALUE         0
>>
>>  struct pm_qos_request {
>> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
>> index d6d6dbd..3e122db 100644
>> --- a/kernel/power/qos.c
>> +++ b/kernel/power/qos.c
>> @@ -101,11 +101,40 @@ static struct pm_qos_object network_throughput_pm_qos = {
>>  };
>>
>>
>> +static BLOCKING_NOTIFIER_HEAD(cpu_dma_throughput_notifier);
>> +static struct pm_qos_constraints cpu_dma_tput_constraints = {
>> +     .list = PLIST_HEAD_INIT(cpu_dma_tput_constraints.list),
>> +     .target_value = PM_QOS_CPU_DMA_THROUGHPUT_DEFAULT_VALUE,
>> +     .default_value = PM_QOS_CPU_DMA_THROUGHPUT_DEFAULT_VALUE,
>> +     .type = PM_QOS_MAX,
>> +     .notifiers = &cpu_dma_throughput_notifier,
>> +};
>> +static struct pm_qos_object cpu_dma_throughput_pm_qos = {
>> +     .constraints = &cpu_dma_tput_constraints,
>> +     .name = "cpu_dma_throughput",
>> +};
>> +
>> +
>> +static BLOCKING_NOTIFIER_HEAD(dvfs_lat_notifier);
>> +static struct pm_qos_constraints dvfs_lat_constraints = {
>> +     .list = PLIST_HEAD_INIT(dvfs_lat_constraints.list),
>> +     .target_value = PM_QOS_DVFS_LAT_DEFAULT_VALUE,
>> +     .default_value = PM_QOS_DVFS_LAT_DEFAULT_VALUE,
>> +     .type = PM_QOS_MIN,
>> +     .notifiers = &dvfs_lat_notifier,
>> +};
>> +static struct pm_qos_object dvfs_lat_pm_qos = {
>> +     .constraints = &dvfs_lat_constraints,
>> +     .name = "dvfs_latency",
>> +};
>> +
>>  static struct pm_qos_object *pm_qos_array[] = {
>>       &null_pm_qos,
>>       &cpu_dma_pm_qos,
>>       &network_lat_pm_qos,
>> -     &network_throughput_pm_qos
>> +     &network_throughput_pm_qos,
>> +     &cpu_dma_throughput_pm_qos,
>> +     &dvfs_lat_pm_qos,
>>  };
>>
>>  static ssize_t pm_qos_power_write(struct file *filp, const char __user *buf,
>>
>
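
To illustrate how the two new classes in the patch above are meant to
aggregate requests, here is a rough userspace sketch. It is not code from
the patch or from kernel/power/qos.c: the names qos_type, qos_class and
qos_target are hypothetical, and the kernel keeps requests in a plist
rather than an array. It only models the semantics of .type = PM_QOS_MAX
(cpu_dma_throughput: the largest request wins) and .type = PM_QOS_MIN
(dvfs_latency: the smallest request wins), with the class default used
when nobody has requested anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a PM QoS class; not the kernel's data structures. */
enum qos_type { QOS_MAX, QOS_MIN };

struct qos_class {
	enum qos_type type;     /* how concurrent requests are combined */
	int32_t default_value;  /* target when there are no requests */
};

/*
 * Return the effective target value of a class, given the values of all
 * currently active requests (n may be 0, in which case reqs is unused).
 */
static int32_t qos_target(const struct qos_class *c,
			  const int32_t *reqs, size_t n)
{
	assert(c != NULL);

	if (n == 0)
		return c->default_value;

	int32_t v = reqs[0];
	for (size_t i = 1; i < n; i++) {
		/* QOS_MAX keeps the largest request, QOS_MIN the smallest. */
		if (c->type == QOS_MAX ? reqs[i] > v : reqs[i] < v)
			v = reqs[i];
	}
	return v;
}
```

With the defaults from the patch (0 for the throughput class and
2000 * USEC_PER_SEC for the DVFS latency class), three throughput
requests of 100, 400 and 250 (arbitrary units) would give a target of
400, while three DVFS latency requests of 100000, 20000 and 50000 usec
would give a target of 20000 usec.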

-- 
MyungJoo Ham, Ph.D.
Mobile Software Platform Lab, DMC Business, Samsung Electronics