Re: [PATCH 1/8] PM: Add suspend block api.

Paul Walmsley <paul@xxxxxxxxx> · Mon, 17 May 2010 21:34:45 -0600 (MDT)

Hello Arve,

On Mon, 17 May 2010, Arve Hjønnevåg wrote:

> On Mon, May 17, 2010 at 7:17 PM, Paul Walmsley <paul@xxxxxxxxx> wrote:
> > On Fri, 14 May 2010, Arve Hjønnevåg wrote:
> >> On Thu, May 13, 2010 at 11:27 PM, Paul Walmsley <paul@xxxxxxxxx> wrote:
> >> > On Thu, 13 May 2010, Arve Hjønnevåg wrote:
> >> >
> >> >> Adds /sys/power/policy that selects the behaviour of /sys/power/state.
> >> >> After setting the policy to opportunistic, writes to /sys/power/state
> >> >> become non-blocking requests that specify which suspend state to enter
> >> >> when no suspend blockers are active. A special state, "on", stops the
> >> >> process by activating the "main" suspend blocker.
> >> >>
> >> >> Signed-off-by: Arve Hjønnevåg <arve@xxxxxxxxxxx>
> >> >
> >> > Looking at the way that suspend-blocks are used in the current Android
> >> > 'msm' kernel tree[1], they seem likely to cause layering violations,
> >> > fragile kernel code with userspace dependencies, and code that consumes
> >> > power unnecessarily.
> >> >
> >> > For example, in drivers/mmc/core/core.c:721, in mmc_rescan() [2], we find
> >> > the following code:
> >> >
> >> >        /* give userspace some time to react */
> >> >        wake_lock_timeout(&mmc_delayed_work_wake_lock, HZ / 2);
> >> >
> >> > This is a layering violation.  The MMC subsystem should have nothing to do
> >> > with "giving userspace time to react."  That is the responsibility of the
> >> > Linux scheduler.
> >> >
> >> > This code is also intrinsically fragile and use-case dependent.  What if
> >> > userspace occasionally needs more than (HZ / 2) to react?  If the
> >> > distributor is lucky enough to catch this before the product ships, then
> >> > the distributor can patch in a new magic number.  But if the device makes
> >> > it to the consumer, the result is an unstable device that unpredictably
> >> > suspends.
> >> >
> >> > The above code will also waste power.  Even if userspace takes less than
> >> > (HZ / 2) to react, the system will still be prevented from entering a
> >> > system-wide low power state for the duration of the remaining time.  This
> >> > is in contrast to an approach that uses the idle loop to enter a
> >> > system-wide low power state.  The moment that the system has no further
> >> > work to do, it can start saving power.
> >> >
> >>
> >> Yes, preventing suspend for 1/2 second wastes some power, but are you
> >> seriously suggesting that it wastes more power then never suspending?
> >> Opportunitic suspend does not prevent entering low power modes from
> >> idle.
> >
> > Sorry, my E-mail was unclear.  Here is my intended point:
> >
> > The current Android opportunistic suspend governor does not enter full
> > system suspend until all suspend-blocks are released.  If the governor
> > were modified to respect the scheduler, then it could enter suspend the
> > moment that the system had no further work to do.
> >
> 
> On the hardware that shipped we enter the same power state from idle
> and suspend, so the only power savings we get from suspend that we
> don't get in idle is from not respecting the scheduler and timers.

Other people 
say that their hardware XXX.  (This is probably likely for anything that 
supports ACPI.)

> > So during part of the (HZ / 2) time interval in the above code, the system
> > is running some kernel or userspace code.  But after that work is done, an
> > idle-loop-based opportunistic suspend governor could enter idle during
> > the remaining portion of that (HZ / 2) time interval, while the current
> > governor must keep the system out of suspend.
> >
> Triggering a full suspend (that disregards normal timers) from idle
> does not work, since the system may temporarily enter idle while
> processing the event.

Could you provide an example?

In such a case, how can the kernel guarantee that the system will process 
the event within (HZ / 2) seconds?

> > Of course, to do this, a different method for freezing processes would be
> > needed; one possible approach is described here[1]; others have proposed
> > similar approaches.
> 
> If you freeze processes you need something like a suspend blocker to
> prevent processes from being frozen. 

(It occurs to me that "frozen" is the wrong terminology to use.
 Perhaps "paused" would be a better term, along the lines of [1].)

The full set of processes does not need to be paused.  For example, if 
"untrusted" processes are run in a cgroup, and only tasks in that cgroup 
are put into TASK_INTERRUPTIBLE, then "trusted" processes will not be 
paused.

> By having a process that never gets frozen you could implement this 
> entirely in user space, but you would have to funnel all wakeup events 
> though this process.

The proposal[2] does require a kernel change.  No wakeup funnel process 
would be needed.  The proposal is to put processes into 
TASK_INTERRUPTIBLE, not TASK_STOPPED (unlike most "freezers").

- Paul

1. Paragraph B4B of Paul Walmsley E-mail to the linux-pm mailing 
   list, dated Mon May 17 18:39:44 PDT 2010:
   https://lists.linux-foundation.org/pipermail/linux-pm/2010-May/025663.html

2. ibid.

_______________________________________________
linux-pm mailing list
linux-pm@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/linux-pm