Re: PM related performance degradation on OMAP3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jean Pihet <jean.pihet@xxxxxxxxxxxxxx> writes:

> On Tue, May 1, 2012 at 7:27 PM, Kevin Hilman <khilman@xxxxxx> wrote:
>> Jean Pihet <jean.pihet@xxxxxxxxxxxxxx> writes:
>>
>>> HI Kevin, Grazvydas,
>>>
>>> On Tue, Apr 24, 2012 at 4:29 PM, Kevin Hilman <khilman@xxxxxx> wrote:
>>>> Jean Pihet <jean.pihet@xxxxxxxxxxxxxx> writes:
>>>>
>>>>> Hi Grazvydas, Kevin,
>>>>>
>>>>> I did some gather some performance measurements and statistics using
>>>>> custom tracepoints in __omap3_enter_idle.
>>> I posted the patches for the power domains registers cache, cf.
>>> http://marc.info/?l=linux-omap&m=133587781712039&w=2.
>>>
>>>>> All the details are at
>>>>> http://www.omappedia.org/wiki/Power_Management_Device_Latencies_Measurement#C1_performance_problem:_analysis
>>> I updated the page with the measurements results with Kevin's patches
>>> and the registers cache patches.
>>>
>>> The results are showing that:
>>> - the registers cache optimizes the low power mode transitions, but is
>>> not sufficient to obtain a big gain. A few unused domains are
>>> transitioning, which causes a big penalty in the idle path.
>>
>> PER is the one that seems to be causing the most latency.
>>
>> Can you try do your measurements using hack below which makes sure that
>> PER isn't any deeper than CORE?
>
> Indeed your patch brings significant improvements, cf. wiki page at
> http://www.omappedia.org/wiki/Power_Management_Device_Latencies_Measurement#C1_performance_problem:_analysis
> for detailed information.
> Here below is the reworked patch, more suited for inclusion in mainline [1]
>
> I have another optimisation -in proof of concept state- that brings
> another significant improvement. It is about allowing/disabling idle
> for only 1 clkdm in a pwrdm and not iterate through all the clkdms.
> This still needs some rework though. Cf. patch [2]

That should work since disabling idle for any clkdm will have the same
effect.  Can you send this as a separate patch with a descriptive
changelog.

Kevin


> Patches [1] and [2] on top of the registers cache and the
> optimisations in pre/post_transition bring the performance close to
> the performance for the non cpuidle case (3.0MB/s compared to 3.1MB/s
> on Beagleboard).
>
> What do you think?
>
> Regards,
> Jean
>
> ---
> [1]
> diff --git a/arch/arm/mach-omap2/cpuidle34xx.c
> b/arch/arm/mach-omap2/cpuidle34xx.c
> index e406d7b..572b605 100644
> +++ b/arch/arm/mach-omap2/cpuidle34xx.c
> @@ -279,32 +279,36 @@ static int omap3_enter_idle_bm(struct cpuidle_device *dev,
> 	int ret;
>
> 	/*
> -	 * Prevent idle completely if CAM is active.
> +	 * Use only C1 if CAM is active.
> 	 * CAM does not have wakeup capability in OMAP3.
> 	 */
> -	if (pwrdm_read_func_pwrst(cam_pd) == PWRDM_FUNC_PWRST_ON) {
> +	if (pwrdm_read_func_pwrst(cam_pd) == PWRDM_FUNC_PWRST_ON)
> 		new_state_idx = drv->safe_state_index;
> -		goto select_state;
> -	}
> -
> -	new_state_idx = next_valid_state(dev, drv, index);
> +	else
> +		new_state_idx = next_valid_state(dev, drv, index);
>
> -	/*
> -	 * Prevent PER off if CORE is not in retention or off as this
> -	 * would disable PER wakeups completely.
> -	 */
> +	/* Program PER state */
> 	cx = cpuidle_get_statedata(&dev->states_usage[new_state_idx]);
> 	core_next_state = cx->core_state;
> -	per_next_state = per_saved_state = pwrdm_read_next_func_pwrst(per_pd);
> -	if ((per_next_state == PWRDM_FUNC_PWRST_OFF) &&
> -	    (core_next_state > PWRDM_FUNC_PWRST_CSWR))
> -		per_next_state = PWRDM_FUNC_PWRST_CSWR;
> +	if (new_state_idx == 0) {
> +		/* In C1 do not allow PER state lower than CORE state */
> +		per_next_state = core_next_state;
> +	} else {
> +		/*
> +		 * Prevent PER off if CORE is not in RETention or OFF as this
> +		 * would disable PER wakeups completely.
> +		 */
> +		per_next_state = per_saved_state =
> +				pwrdm_read_next_func_pwrst(per_pd);
> +		if ((per_next_state == PWRDM_FUNC_PWRST_OFF) &&
> +		    (core_next_state > PWRDM_FUNC_PWRST_CSWR))
> +			per_next_state = PWRDM_FUNC_PWRST_CSWR;
> +	}
>
> 	/* Are we changing PER target state? */
> 	if (per_next_state != per_saved_state)
> 		omap_set_pwrdm_state(per_pd, per_next_state);
>
> -select_state:
> 	ret = omap3_enter_idle(dev, drv, new_state_idx);
>
> 	/* Restore original PER state if it was modified */
> @@ -390,7 +394,6 @@ int __init omap3_idle_init(void)
>
> 	/* C1 . MPU WFI + Core active */
> 	_fill_cstate(drv, 0, "MPU ON + CORE ON");
> -	(&drv->states[0])->enter = omap3_enter_idle;
> 	drv->safe_state_index = 0;
> 	cx = _fill_cstate_usage(dev, 0);
> 	cx->valid = 1;	/* C1 is always valid */
>
> [2]
> diff --git a/arch/arm/mach-omap2/cpuidle34xx.c
> b/arch/arm/mach-omap2/cpuidle34xx.c
> index e406d7b..6aa3c75 100644
> --- a/arch/arm/mach-omap2/cpuidle34xx.c
> +++ b/arch/arm/mach-omap2/cpuidle34xx.c
> @@ -118,8 +118,10 @@ static int __omap3_enter_idle(struct cpuidle_device *dev,
>
>  	/* Deny idle for C1 */
>  	if (index == 0) {
> -		pwrdm_for_each_clkdm(mpu_pd, _cpuidle_deny_idle);
> -		pwrdm_for_each_clkdm(core_pd, _cpuidle_deny_idle);
> +		//pwrdm_for_each_clkdm(mpu_pd, _cpuidle_deny_idle);
> +		clkdm_deny_idle(mpu_pd->pwrdm_clkdms[0]);
> +		//pwrdm_for_each_clkdm(core_pd, _cpuidle_deny_idle);
> +		clkdm_deny_idle(core_pd->pwrdm_clkdms[0]);
>  	}
>
>  	/*
> @@ -141,8 +143,10 @@ static int __omap3_enter_idle(struct cpuidle_device *dev,
>
>  	/* Re-allow idle for C1 */
>  	if (index == 0) {
> -		pwrdm_for_each_clkdm(mpu_pd, _cpuidle_allow_idle);
> -		pwrdm_for_each_clkdm(core_pd, _cpuidle_allow_idle);
> +		//pwrdm_for_each_clkdm(mpu_pd, _cpuidle_allow_idle);
> +		clkdm_allow_idle(mpu_pd->pwrdm_clkdms[0]);
> +		//pwrdm_for_each_clkdm(core_pd, _cpuidle_allow_idle);
> +		clkdm_allow_idle(core_pd->pwrdm_clkdms[0]);
>  	}
>
>  return_sleep_time:
>
>>
>> Kevin
>>
>> From bb2f67ed93dc83c645080e293d315d383c23c0c6 Mon Sep 17 00:00:00 2001
>> From: Kevin Hilman <khilman@xxxxxx>
>> Date: Mon, 16 Apr 2012 17:53:14 -0700
>> Subject: [PATCH] cpuidle34xx: per follows core, C1 use _bm
>>
>> ---
>>  arch/arm/mach-omap2/cpuidle34xx.c |    9 +++++----
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c
>> index 374708d..00400ad 100644
>> --- a/arch/arm/mach-omap2/cpuidle34xx.c
>> +++ b/arch/arm/mach-omap2/cpuidle34xx.c
>> @@ -278,9 +278,11 @@ static int omap3_enter_idle_bm(struct cpuidle_device *dev,
>>        cx = cpuidle_get_statedata(&dev->states_usage[index]);
>>        core_next_state = cx->core_state;
>>        per_next_state = per_saved_state = pwrdm_read_next_pwrst(per_pd);
>> -       if ((per_next_state == PWRDM_POWER_OFF) &&
>> -           (core_next_state > PWRDM_POWER_RET))
>> -               per_next_state = PWRDM_POWER_RET;
>> +       /* if ((per_next_state == PWRDM_POWER_OFF) && */
>> +       /*     (core_next_state > PWRDM_POWER_RET)) */
>> +       /*      per_next_state = PWRDM_POWER_RET; */
>> +       if (per_next_state < core_next_state)
>> +               per_next_state = core_next_state;
>>
>>        /* Are we changing PER target state? */
>>        if (per_next_state != per_saved_state)
>> @@ -374,7 +376,6 @@ int __init omap3_idle_init(void)
>>
>>        /* C1 . MPU WFI + Core active */
>>        _fill_cstate(drv, 0, "MPU ON + CORE ON");
>> -       (&drv->states[0])->enter = omap3_enter_idle;
>>        drv->safe_state_index = 0;
>>        cx = _fill_cstate_usage(dev, 0);
>>        cx->valid = 1;  /* C1 is always valid */
>> --
>> 1.7.9.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux