On Wed, May 19, 2010 at 8:55 AM, Paul Walmsley <paul@xxxxxxxxx> wrote: > Hi Arve, > > On Mon, 17 May 2010, Arve Hjønnevåg wrote: > >> On Mon, May 17, 2010 at 8:34 PM, Paul Walmsley <paul@xxxxxxxxx> wrote: >> > On Mon, 17 May 2010, Arve Hjønnevåg wrote: >> >> On Mon, May 17, 2010 at 7:17 PM, Paul Walmsley <paul@xxxxxxxxx> wrote: >> >> > On Fri, 14 May 2010, Arve Hjønnevåg wrote: >> >> >> On Thu, May 13, 2010 at 11:27 PM, Paul Walmsley <paul@xxxxxxxxx> wrote: >> >> >> > On Thu, 13 May 2010, Arve Hjønnevåg wrote: >> >> >> > >> >> >> >> Adds /sys/power/policy that selects the behaviour of /sys/power/state. >> >> >> >> After setting the policy to opportunistic, writes to /sys/power/state >> >> >> >> become non-blocking requests that specify which suspend state to enter >> >> >> >> when no suspend blockers are active. A special state, "on", stops the >> >> >> >> process by activating the "main" suspend blocker. >> >> >> >> >> >> >> >> Signed-off-by: Arve Hjønnevåg <arve@xxxxxxxxxxx> >> >> >> > >> >> >> > Looking at the way that suspend-blocks are used in the current Android >> >> >> > 'msm' kernel tree[1], they seem likely to cause layering violations, >> >> >> > fragile kernel code with userspace dependencies, and code that consumes >> >> >> > power unnecessarily. >> >> >> > >> >> >> > For example, in drivers/mmc/core/core.c:721, in mmc_rescan() [2], we find >> >> >> > the following code: >> >> >> > >> >> >> > /* give userspace some time to react */ >> >> >> > wake_lock_timeout(&mmc_delayed_work_wake_lock, HZ / 2); >> >> >> > >> >> >> > This is a layering violation. The MMC subsystem should have nothing to do >> >> >> > with "giving userspace time to react." That is the responsibility of the >> >> >> > Linux scheduler. >> >> >> > >> >> >> > This code is also intrinsically fragile and use-case dependent. What if >> >> >> > userspace occasionally needs more than (HZ / 2) to react? If the >> >> >> > distributor is lucky enough to catch this before the product ships, then >> >> >> > the distributor can patch in a new magic number. But if the device makes >> >> >> > it to the consumer, the result is an unstable device that unpredictably >> >> >> > suspends. >> >> >> > >> >> >> > The above code will also waste power. Even if userspace takes less than >> >> >> > (HZ / 2) to react, the system will still be prevented from entering a >> >> >> > system-wide low power state for the duration of the remaining time. This >> >> >> > is in contrast to an approach that uses the idle loop to enter a >> >> >> > system-wide low power state. The moment that the system has no further >> >> >> > work to do, it can start saving power. >> >> >> > >> >> >> >> >> >> Yes, preventing suspend for 1/2 second wastes some power, but are you >> >> >> seriously suggesting that it wastes more power then never suspending? >> >> >> Opportunitic suspend does not prevent entering low power modes from >> >> >> idle. >> >> > >> >> > Sorry, my E-mail was unclear. Here is my intended point: >> >> > >> >> > The current Android opportunistic suspend governor does not enter full >> >> > system suspend until all suspend-blocks are released. If the governor >> >> > were modified to respect the scheduler, then it could enter suspend the >> >> > moment that the system had no further work to do. >> >> > >> >> >> >> On the hardware that shipped we enter the same power state from idle >> >> and suspend, so the only power savings we get from suspend that we >> >> don't get in idle is from not respecting the scheduler and timers. >> > >> > Other people >> > say that their hardware XXX. (This is probably likely for anything that >> > supports ACPI.) > > Heh, looks like this part of my E-mail escaped a little too early. > >> >> > So during part of the (HZ / 2) time interval in the above code, the system >> >> > is running some kernel or userspace code. But after that work is done, an >> >> > idle-loop-based opportunistic suspend governor could enter idle during >> >> > the remaining portion of that (HZ / 2) time interval, while the current >> >> > governor must keep the system out of suspend. >> >> > >> >> Triggering a full suspend (that disregards normal timers) from idle >> >> does not work, since the system may temporarily enter idle while >> >> processing the event. >> > >> > Could you provide an example? >> >> The code takes a pagefault. The system is idle while waiting for the >> dma to finish. > > OK. > >> > In such a case, how can the kernel guarantee that the system will process >> > the event within (HZ / 2) seconds? >> >> It can't, but in my opinion if the event did not start processing in >> 1/2 second it was probably not time critical. > > Thanks for the correction regarding the unit of the time interval. > > To use the pagefault example, wouldn't that be partially dependent on > factors outside the control of the task that handles the input event? > (e.g., the time needed to page back in, or other code running on the > system.) > Yes (we don't need any timeouts to handle input events though). >> >> > Of course, to do this, a different method for freezing processes would be >> >> > needed; one possible approach is described here[1]; others have proposed >> >> > similar approaches. >> >> >> >> If you freeze processes you need something like a suspend blocker to >> >> prevent processes from being frozen. >> > >> > (It occurs to me that "frozen" is the wrong terminology to use. >> > Perhaps "paused" would be a better term, along the lines of [1].) >> > >> > The full set of processes does not need to be paused. For example, if >> > "untrusted" processes are run in a cgroup, and only tasks in that cgroup >> > are put into TASK_INTERRUPTIBLE, then "trusted" processes will not be >> > paused. >> >> We want to freeze trusted processes. > > By "trusted," I mean processes that are run with a suspend block, as in > your example to Tony[1]. Isn't it so that Android has processes that > either run entirely under suspend-blocks, or which take suspend-blocks at > some point in time during their execution? Wouldn't those processes need > to be prevented from a force-pause or force-freeze? > We don't freeze processes while any suspend blockers are active. >> >> By having a process that never gets frozen you could implement this >> >> entirely in user space, but you would have to funnel all wakeup events >> >> though this process. >> > >> > The proposal[2] does require a kernel change. No wakeup funnel process >> > would be needed. The proposal is to put processes into >> > TASK_INTERRUPTIBLE, not TASK_STOPPED (unlike most "freezers"). >> > >> >> Why is the method used for pausing the processes relevant? If the >> process is paused it cannot process the wakeup event. > > Tasks in TASK_INTERRUPTIBLE are woken back up when an event arrives, So you are not actually preventing them from. A task that wakes up 60 times a second will still drain the battery, unless you happened to pause it while it was active. > unlike tasks in TASK_STOPPED, which are stopped until a SIGCONT arrives. > >> The only way to get the wake event to the process is to funnel it though >> a process that never is paused. > > > - Paul > > 1. http://lkml.org/lkml/2010/5/7/420 > > -- Arve Hjønnevåg _______________________________________________ linux-pm mailing list linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/linux-pm