Quoting Mason (2015-02-05 19:13:07)
> Hello everyone,
>
> I've been reading about related sub-systems (cpuidle and suspend)
> and I'm not sure I understand how they relate / interact.
>
> If I understand correctly (please do point out any misconceptions)
> on ARM Cortex A9, the first level of power saving is WFI, which is
> typically called from the idle loop.
>
> This places the core in low-power mode ("Standby mode" in ARM docs).
> "RAM arrays" (don't know what they are), "processor logic", and
> "data engine" (not sure what any of these exactly refers to, guess
> I have more reading to do) are still powered-up, but most of the
> clocks are disabled.
>
> In ARM's exact words, "WFI and WFE Standby modes disable most of the
> clocks in a processor, while keeping its logic powered up. This reduces
> the power drawn to the static leakage current, leaving a tiny clock
> power overhead requirement to enable the device to wake up."
>
> Some CPUs like Intel's have several levels of sleep (deeper levels
> mean less power, but have a higher wake-up latency). AFAIU, cpuidle
> is used to describe and manage these levels?

(warning: over-simplification below. Please be kind if you decide to
blow it up)

Hello Mason,

A critical thing to understand is that you are talking about two
classes of idle or power-saving behavior.

First there are the physical idle states and low-power states that the
*hardware* (silicon) can achieve. These vary in how much power they
save, with trade-offs such as increased wake-up latency, loss of
context/cache, etc. WFI is the gateway to low-power states in ARM
hardware (from the perspective of the Linux kernel). A plain WFI
without any extra steps will gate the CPU's clocks. With some extra
steps (programming the target power domain state, etc.), WFI can
trigger lower voltages supplied by the PMIC/regulators, or total power
gating of power domains/islands, resulting in greater energy savings
but a costlier wake-up and loss of context.

The second behavior is what the *software* (Linux OS) tries to do to
save power. You mentioned two such behaviors above:

1) CPUidle tries to save power by programming the hardware to a low
power idle state (see above) during moments of idleness. What is idle
time? It is when no work is scheduled to be run Right Now and the
scheduler enters the idle thread/loop. Note that CPUidle does not aim
to affect the schedulability (new word!) of the Linux scheduler. I.e.
it ideally should not impact performance, as it only targets a low
power hardware idle state opportunistically, based on naturally
occurring idle time from the scheduler.

2) Suspend is very different from CPUidle. It *forces* idleness upon
the OS until a wake-up event resumes the OS from suspend. Imagine
closing the lid on your laptop while it is running. That is suspend.
Processes are frozen regardless of whether we have lots of work
scheduled or not. Suspend forces the OS to be idle. Typically this
software idleness corresponds to the deepest hardware idle state, but
it doesn't have to.

That last point is why it is important to understand the difference
between idling in software and idling in hardware. More on that below.

>
> Isn't suspend somewhat like the deepest level of sleep?
> (Or is it different in that things like RAM state are only a concern
> for suspend, not cpuidle?)
>

There is nothing stopping a platform from suspending to RAM while
leaving everything powered up and only clock gating the CPUs with a
WFI. That is a brain-dead thing to do, but it is possible, and it
illustrates the separation of software and hardware idling.

Regarding CPUidle, if you predict that you will be idle for a long
enough period of time then it is perfectly valid for you to hit your
deepest sleep state in the CPUidle path.
OMAP3 did this quite well: it had a CHIP OFF state that was used both
by suspend/hibernate and by CPUidle (when CPUidle thought that there
was sufficient idle time to go to that state without adversely
affecting performance).

However, these days it does seem more common for suspend to target a
deeper hardware sleep state than the deepest possible CPUidle state for
a given platform.

Finally, the idle states (C-states) available to CPUidle drivers in the
mainline Linux kernel are often a poor representation of what the
hardware can really do to save power. Look at the vendor git tree for
whatever platform you are hacking on and you will usually see many more
C-states there than what is merged upstream. Maybe we'll fix that
problem some day.

Regards,
Mike

> Are both subsystems still actively used?
>
> I saw plans to merge cpufreq into cpuidle / scheduler decisions.
>
> LCA14: LCA14-306: CPUidle & CPUfreq integration with scheduler
> http://www.slideshare.net/linaroorg/lca14-306-cpuidlecpufreqintegrationwithscheduler
>
> This presentation doesn't mention suspend, I think.
>
> ARM has a mode called "Dormant Mode". Is suspend typically
> used to put the SoC in that mode?
>
> I think I need to read this document carefully:
> Power Management In The Linux Kernel -- Current Status And Future
> http://events.linuxfoundation.org/sites/events/files/slides/kernel_PM_plain.pdf
>
> There's also an older document that may prove insightful:
> CPUIdle versus Suspend
> http://www.linuxplumbersconf.org/2010/ocw/proposals/789
>
> But things move so fast in kernel-land, that I don't know how relevant
> a 4 year-old document can be.
>
> Regards.
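
P.S. As a rough illustration of the CPUidle point above (pick the
deepest state that the predicted idle time can justify), here is a
minimal sketch in C. It is not kernel code: struct idle_state,
example_states, pick_idle_state() and all of the numbers are made up
for illustration, and real governors weigh more inputs than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one hardware idle state (not a kernel struct). */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;     /* time needed to wake up again */
    unsigned int target_residency_us; /* minimum idle time for the state to pay off */
};

/* Made-up table, ordered shallowest to deepest; numbers are illustrative only. */
const struct idle_state example_states[] = {
    { "WFI (clock gated)",     1,     1 },
    { "CPU retention",       100,   500 },
    { "CPU off",            1000,  5000 },
    { "chip off",          10000, 50000 },
};

#define EXAMPLE_NR_STATES (sizeof(example_states) / sizeof(example_states[0]))

/*
 * Return the index of the deepest state whose target residency fits in the
 * predicted idle time and whose exit latency respects the latency limit.
 * Falls back to state 0 (plain WFI) if nothing deeper qualifies.
 */
size_t pick_idle_state(const struct idle_state *states, size_t nr_states,
                       unsigned int predicted_idle_us,
                       unsigned int latency_limit_us)
{
    size_t best = 0;

    assert(nr_states > 0);
    for (size_t i = 1; i < nr_states; i++) {
        if (states[i].target_residency_us > predicted_idle_us)
            continue; /* not idle long enough to be worth entering */
        if (states[i].exit_latency_us > latency_limit_us)
            continue; /* would wake up too slowly */
        best = i;
    }
    return best;
}
```

With this made-up table and a generous latency limit, a predicted 600 us
of idle time selects "CPU retention" and 60000 us selects "chip off".
Suspend does not go through a prediction like this at all: it forces
the idleness and then enters whatever state the platform chose for it.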