> -----Original Message-----
> From: saileshcv@xxxxxxxxx [mailto:saileshcv@xxxxxxxxx] On Behalf Of C V, Silesh
> Sent: Tuesday, August 31, 2010 12:27 PM
> To: Sripathy, Vishwanath
> Cc: Kevin Hilman; vishwanath.sripathy@xxxxxxxxxx; linux-omap@xxxxxxxxxxxxxxx;
> linaro-dev@xxxxxxxxxxxxxxxx
> Subject: Re: [PATCH] OMAP CPUIDLE: CPU Idle latency measurement
>
> On Tue, Aug 31, 2010 at 10:28 AM, Sripathy, Vishwanath
> <vishwanath.bs@xxxxxx> wrote:
> >
> >> -----Original Message-----
> >> From: Silesh C V [mailto:saileshcv@xxxxxxxxx]
> >> Sent: Tuesday, August 31, 2010 9:53 AM
> >> To: Sripathy, Vishwanath
> >> Cc: Kevin Hilman; vishwanath.sripathy@xxxxxxxxxx; linux-omap@xxxxxxxxxxxxxxx;
> >> linaro-dev@xxxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH] OMAP CPUIDLE: CPU Idle latency measurement
> >>
> >> Hi Vishwa,
> >>
> >> On Mon, Aug 30, 2010 at 6:29 PM, Sripathy, Vishwanath
> >> <vishwanath.bs@xxxxxx> wrote:
> >> > Kevin,
> >> >
> >> >> -----Original Message-----
> >> >> From: linux-omap-owner@xxxxxxxxxxxxxxx [mailto:linux-omap-
> >> >> owner@xxxxxxxxxxxxxxx] On Behalf Of Kevin Hilman
> >> >> Sent: Saturday, August 28, 2010 12:45 AM
> >> >> To: vishwanath.sripathy@xxxxxxxxxx
> >> >> Cc: linux-omap@xxxxxxxxxxxxxxx; linaro-dev@xxxxxxxxxxxxxxxx
> >> >> Subject: Re: [PATCH] OMAP CPUIDLE: CPU Idle latency measurement
> >> >>
> >> >> vishwanath.sripathy@xxxxxxxxxx writes:
> >> >>
> >> >> > From: Vishwanath BS <vishwanath.sripathy@xxxxxxxxxx>
> >> >> >
> >> >> > This patch adds instrumentation code for measuring the latencies of
> >> >> > the various CPUIdle C-states on OMAP. The idea is to capture a
> >> >> > timestamp at various phases of CPU idle and then compute the SW
> >> >> > latency for each C-state. On OMAP, the 32k clock is chosen as the
> >> >> > reference clock, as it is an always-on clock. Wakeup-domain memory
> >> >> > (scratchpad memory) is used for storing the timestamps. One can see
> >> >> > the worst-case latencies in the sysfs entries below (after enabling
> >> >> > CONFIG_CPU_IDLE_PROF in .config). This information can be used to
> >> >> > configure the cpuidle latencies correctly for the various C-states,
> >> >> > after adding the HW latencies to each of these SW latencies.
> >> >> > /sys/devices/system/cpu/cpu0/cpuidle/state<n>/actual_latency
> >> >> > /sys/devices/system/cpu/cpu0/cpuidle/state<n>/sleep_latency
> >> >> > /sys/devices/system/cpu/cpu0/cpuidle/state<n>/wkup_latency
> >> >> >
> >> >> > This patch was tested on OMAP ZOOM3 using Kevin's pm branch.
> >> >> >
> >> >> > Signed-off-by: Vishwanath BS <vishwanath.sripathy@xxxxxxxxxx>
> >> >> > Cc: linaro-dev@xxxxxxxxxxxxxxxx
> >> >>
> >> >> While I have many problems with the implementation details, I won't go
> >> >> into them, because in general this is the wrong direction for kernel
> >> >> instrumentation.
> >> >>
> >> >> This approach adds quite a bit of overhead to the idle path itself:
> >> >> all the reads/writes from/to the scratchpad(?), all the multiplications
> >> >> and divides in every idle path, and the wait-for-idlest in both the
> >> >> sleep and resume paths. The additional overhead is non-trivial.
> >> >>
> >> >> Basically, I'd like to get away from custom instrumentation and
> >> >> measurement code inside the kernel itself. This kind of code never
> >> >> stops growing and morphing into ugliness, and rarely scales well when
> >> >> new SoCs are added.
> >> >>
> >> >> With ftrace/perf, we can add tracepoints at specific points and use
> >> >> external tools to extract and analyze the delays, latencies, etc.
> >> >>
> >> >> The point is to keep the minimum possible in the kernel: just the
> >> >> tracepoints we're interested in. The rest (calculations, averages,
> >> >> analysis, etc.) does not need to be in the kernel and can be done more
> >> >> easily, with more powerful tools, outside the kernel.
> >> >
> >> > The challenge here is that we need to take the timestamp at the tail
> >> > end of CPU idle, where we have no access to DDR and the MMU/caches are
> >> > disabled (on OMAP3). So I am not sure we will be able to use
> >> > ftrace/perf-style tools here. If we choose to exclude the assembly
> >> > code from the measurement, we will be omitting a major contributor to
> >> > CPU idle latency, namely the ARM context save/restore.
> >> >
> >> > Also, these calculations are done only when the CPUIDLE profiling
> >> > feature is enabled; they do not come into the picture at all in a
> >> > normal production system. So I am not sure the latencies involved in
> >> > these calculations are an issue when we are just doing profiling.
> >>
> >> There are two other issues when we use the 32k timer for latency
> >> measurement.
> >>
> >> <snip>
> >> +
> >> +	/* take care of overflow */
> >> +	if (postidle_time < preidle_time)
> >> +		postidle_time += (u32) 0xffffffff;
> >> +	if (wkup_time < sleep_time)
> >> +		wkup_time += (u32) 0xffffffff;
> >> +
> >> <snip>
> >>
> >> 1. We check postidle_time < preidle_time to find out whether there has
> >> been an overflow. There can be situations in which the timer overflows
> >> and we still have a greater postidle_time.
> >>
> >> 2. We do the correction for one overflow. What happens if the timer
> >> overflows a second or third time? Can we keep track of the number of
> >> overflows and then do the correction accordingly?
> >
> > Unfortunately, there is no way to check whether overflow happens more
> > than once with the 32k timer, and as you said, if the timer did overflow
> > more than once these calculations would be wrong. Having said that, do
> > you really see any use case where the system idles for more than 37
> > hours in a single cpuidle execution, causing the timer to overflow?
>
> I am not sure. But can we completely write off such a possibility?

I do not think it is a realistic possibility. Also, I believe this problem
applies even to system time, since the same 32k clock is used to maintain
system time.
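For what it is worth, the single-overflow case does not really need an
explicit check at all: if the two samples are raw reads of a free-running
32-bit counter, unsigned subtraction is already correct modulo 2^32. (And
note that adding 0xffffffff, as the patch does, is one tick short of the
actual modulus, which is 2^32.) A minimal sketch, assuming raw u32 reads of
the 32k sync counter; the helper name here is made up for illustration:

#include <linux/types.h>

/*
 * Delta between two raw reads of a free-running 32-bit 32 kHz counter.
 * Unsigned subtraction wraps modulo 2^32, so a single counter overflow
 * is handled with no special casing. A second overflow is undetectable
 * from the two samples alone: the counter wraps every
 * 2^32 / 32768 Hz = 131072 s, i.e. about 36.4 hours -- which is where
 * the "37 hours" figure above comes from.
 */
static inline u32 clk32k_ticks_between(u32 before, u32 after)
{
	return after - before;	/* correct across at most one wrap */
}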
Vishwa

> Regards,
> Silesh
>
> > Vishwa
> >>
> >> Regards,
> >> Silesh
> >>
> >> > Regards
> >> > Vishwa
> >> >>
> >> >> Kevin
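P.S. To make Kevin's ftrace/perf suggestion concrete: a tracepoint of the
kind he describes might look roughly like the sketch below. The event name
and fields are hypothetical (nothing like this exists in the patch), and
the usual TRACE_SYSTEM/define_trace.h header boilerplate is omitted. The
idle path would then only call trace_omap_cpuidle_entry(state, ts) at each
point of interest, with all averaging and worst-case tracking moved to
userspace tools -- though, as discussed above, this cannot cover the window
where DDR and the MMU are unavailable.

#include <linux/tracepoint.h>

/* Illustrative only: records which C-state is entered and a raw
 * 32k-counter timestamp, leaving all analysis to userspace. */
TRACE_EVENT(omap_cpuidle_entry,

	TP_PROTO(unsigned int state, u32 ts_32k),

	TP_ARGS(state, ts_32k),

	TP_STRUCT__entry(
		__field(unsigned int, state)
		__field(u32, ts_32k)
	),

	TP_fast_assign(
		__entry->state = state;
		__entry->ts_32k = ts_32k;
	),

	TP_printk("state=%u ts_32k=%u", __entry->state, __entry->ts_32k)
);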