[linux-pm] Some thoughts on suspend/resume development

mochel at digitalimplant.org (Patrick Mochel) · Tue Mar 8 09:13:13 2005

On Sat, 5 Mar 2005, Alan Stern wrote:

> Up to now, the PM development effort has been concerned primarily with
> system-wide sleep transitions, things like Suspend-To-RAM (STR) and
> Suspend-To-Disk (STD).  (A more general, less PC-centric description
> would call these states "deep sleep" and "shallow sleep".  A third
> possible state, which some people might be in favor of, is Standby or
> "very shallow sleep".)

Ugh. I see there is still disagreement about naming. What type of platform
uses that naming scheme? I've always been under the impression that STR,
STD, and Standby were generic names; at least that is what has been stated
in the code and email for ~4 years.

> Now it's time to consider how to implement additional power-saving
> measures -- in other words, selective suspends.

This has commonly been referred to as "Runtime Power Management" or more
generally "Device Power Management" (as opposed to "System Power
Management").

> A common problem for all selective suspends is that, unlike system
> sleeps, they can occur at any time.  Drivers will get very confused
> unless we can guarantee somehow that, at a minimum, they will not
> receive a suspend or resume call for a device while its probe or
> release routine is running.

That's a good point. In general, a driver should only get suspend/resume
calls when it's bound to a device, which is technically after ->probe()
and before ->release(). This means that the interface for controlling
power state should only be exposed during that time. (Currently the power/
directory is added when the (physical) device is registered.)

We can change this by not adding the power/ directory (and associated
files) until after the driver is bound. But, I think a better solution
would be for the bus subsystems to add/remove the power control files,
since they a) know when the driver is bound/unbound and b) are likely to
have a bus-specific interface (like the name and number of power states to
enter).

This would also easily allow the bus to provide a default power interface
for devices that are not bound to drivers.
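To make the bind/unbind idea concrete, here is a rough userspace model of
a bus that owns the power control files. It is only a sketch: every name
in it (model_bus, bus_bind_driver, and so on) is made up for illustration
and is not an existing driver-core interface, and the "files" are just
strings in an array. The state names are borrowed from PCI purely as
examples.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MAX_POWER_FILES 8

struct model_driver {
	const char *name;
	const char *const *states;	/* states this driver can enter */
	size_t nr_states;
};

struct model_bus {
	const char *name;
	const char *const *default_states; /* exposed while no driver is bound */
	size_t nr_default_states;
};

struct model_device {
	const struct model_bus *bus;
	const struct model_driver *drv;	/* NULL while unbound */
	const char *power_files[MAX_POWER_FILES];
	size_t nr_power_files;
};

/* Replace the device's set of power control "files" in one step. */
static void bus_set_power_files(struct model_device *dev,
				const char *const *states, size_t n)
{
	assert(n <= MAX_POWER_FILES);
	dev->nr_power_files = n;
	for (size_t i = 0; i < n; i++)
		dev->power_files[i] = states[i];
}

/* Registration exposes only the bus's default interface. */
void bus_register_device(struct model_device *dev, const struct model_bus *bus)
{
	dev->bus = bus;
	dev->drv = NULL;
	bus_set_power_files(dev, bus->default_states, bus->nr_default_states);
}

/* A real bus would run the driver's probe routine first; the files only
 * appear once the binding is complete. */
void bus_bind_driver(struct model_device *dev, const struct model_driver *drv)
{
	dev->drv = drv;
	bus_set_power_files(dev, drv->states, drv->nr_states);
}

/* The driver's files go away before the driver does, so no request can
 * reach a driver that is being released. */
void bus_unbind_driver(struct model_device *dev)
{
	bus_set_power_files(dev, dev->bus->default_states,
			    dev->bus->nr_default_states);
	dev->drv = NULL;
}

/* Returns 1 if a control file with this name is currently exposed. */
int device_has_power_file(const struct model_device *dev, const char *name)
{
	for (size_t i = 0; i < dev->nr_power_files; i++)
		if (strcmp(dev->power_files[i], name) == 0)
			return 1;
	return 0;
}
```

The point of the model is the ordering: the driver-specific files exist
only between bind and unbind, and an unbound device still has whatever
default interface the bus chooses to give it.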

> An important difference between system sleep and selective suspend is
> that with selective suspend, we generally expect the device to resume
> on demand.  This demand may take the form of a request to the driver
> (e.g., a block I/O request for a disk device) or a resume request from
> the device itself (e.g., a notification from a mouse that has just
> been moved).  This means that input queues must not be plugged and
> device interrupts must remain enabled, exactly the opposite of what
> happens during system sleep.  For this reason it is vital for drivers
> to know whether a suspend call is invoking a system sleep or a
> selective suspend.  Hence I propose that a new pm_message_t event code,
> PMSG_SELECTIVE (or maybe PMSG_SELECTIVE_SUSPEND), be used for selective
> suspends.

I +/- agree, though I think there also must be a way to completely suspend
the device, like when you are doing a system suspend.
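For illustration only, a driver's suspend routine that tells the two
cases apart might look something like the model below. The event names
and the model_dev_state structure are hypothetical stand-ins (the
proposed PMSG_SELECTIVE does not exist yet), and the three integer flags
merely stand in for real hardware and queue state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes, standing in for the proposed pm_message_t
 * values. MODEL_PMSG_SUSPEND is the "complete" suspend used for system
 * sleep; MODEL_PMSG_SELECTIVE is the resume-on-demand variant. */
enum model_pm_event {
	MODEL_PMSG_SUSPEND,
	MODEL_PMSG_SELECTIVE,
};

struct model_dev_state {
	int powered;		/* 1 while the device is fully on */
	int irq_enabled;	/* 1 if the device may still interrupt */
	int queue_plugged;	/* 1 if new requests are held back */
};

void model_suspend(struct model_dev_state *d, enum model_pm_event ev)
{
	assert(d != NULL);
	d->powered = 0;
	if (ev == MODEL_PMSG_SELECTIVE) {
		/* Resume on demand: keep accepting requests and wakeups. */
		d->irq_enabled = 1;
		d->queue_plugged = 0;
	} else {
		/* Complete suspend: quiesce everything. */
		d->irq_enabled = 0;
		d->queue_plugged = 1;
	}
}
```

Keeping the complete-suspend path reachable through the same entry point
is what allows a device to be fully suspended at runtime as well as
during a system transition.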

> With resume-on-demand implemented properly, a driver may decide that
> it can suspend its device without bothering to suspend the device's
> children.  This kind of decision should be left to individual drivers
> and the PM core shouldn't try to enforce a "children must be suspended
> before their parents" policy for selective suspends.

Also true, and even true for system suspend states. While some child
devices may not support PM, a parent device could, and could power down
the entire bus. It's important that we do descendant-ancestor ordering
correctly during system suspend transitions. For runtime transitions, we
need a way for the driver of a parent device to return an error if its
child devices aren't in a compatible state for it (the parent) to be
suspended.
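A minimal sketch of that check, as a userspace model: the parent's
suspend routine looks at its children and refuses with -EBUSY if any of
them is still active and cannot cope with the parent going away. The
structure, the needs_parent_awake flag, and the function name are all
assumptions made up for this example, not existing interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum model_state { MODEL_ON, MODEL_SUSPENDED };

/* Hypothetical device node; not a kernel structure. */
struct model_dev {
	enum model_state state;
	int needs_parent_awake;	/* nonzero: unusable while the parent sleeps */
	struct model_dev **children;
	size_t nr_children;
};

/*
 * Runtime suspend for one device. The decision is left to the device's
 * own driver: it fails with -EBUSY if a child is active and depends on
 * this device staying awake, and leaves the state untouched in that case.
 */
int model_runtime_suspend(struct model_dev *dev)
{
	assert(dev != NULL);
	for (size_t i = 0; i < dev->nr_children; i++) {
		const struct model_dev *child = dev->children[i];

		if (child->state != MODEL_SUSPENDED &&
		    child->needs_parent_awake)
			return -EBUSY;
	}
	dev->state = MODEL_SUSPENDED;
	return 0;
}
```

A child that supports resume-on-demand would leave needs_parent_awake
clear, so its parent could be suspended even while the child is
nominally on.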

This would be doing something like partial-tree suspends, but I'm not sure
if this is best done in the kernel or in userspace with a proper tool.

> A common problem facing all drivers that do auto suspend is how to set
> the inactivity timeout.  Two possible answers are: add an attribute
> file in the /sys/.../power directory (so different devices can have
> different timeouts), or add a driver module parameter (so all devices
> using the same driver will have the same timeout).

It's trickier than that. You want a per-device parameter that can be
adjusted. You also want a per-state parameter so that a device can
gradually enter a deeper and deeper state over time. (You can do it with
one timer per device that is re-armed with the timeout value of the next
state each time it fires, but that's an implementation issue.)
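Roughly, the single-timer scheme could look like the following userspace
model. Time is just an integer passed in by the caller, and all of the
names (idle_dev, idle_activity, idle_tick) are invented for this sketch.
The timeouts array is assumed to hold one entry per transition, so a
device with N states has N-1 timeouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device idle tracking; not a kernel structure. */
struct idle_dev {
	const unsigned *timeouts; /* timeouts[i]: idle time in state i
				   * before dropping to state i+1 */
	size_t nr_states;	  /* including state 0, fully on */
	size_t state;		  /* current state, 0 = fully on */
	unsigned long deadline;	  /* when the one timer fires, if armed */
	int armed;
};

/* Arm the single timer for the next deeper state, if there is one. */
static void idle_arm(struct idle_dev *d, unsigned long now)
{
	if (d->state + 1 < d->nr_states) {
		d->deadline = now + d->timeouts[d->state];
		d->armed = 1;
	} else {
		d->armed = 0;	/* already in the deepest state */
	}
}

/* Any activity brings the device fully on and restarts the countdown. */
void idle_activity(struct idle_dev *d, unsigned long now)
{
	assert(d != NULL && d->nr_states > 0);
	d->state = 0;
	idle_arm(d, now);
}

/* Advance simulated time; each expiry drops one state and re-arms the
 * same timer with the next state's timeout. */
void idle_tick(struct idle_dev *d, unsigned long now)
{
	while (d->armed && now >= d->deadline) {
		unsigned long fired = d->deadline;

		d->state++;
		idle_arm(d, fired);
	}
}
```

With timeouts of 5 and 30, a device that goes idle at time 0 drops to
state 1 at time 5 and to state 2 at time 35, using one timer throughout.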

So, it's bus-specific because it involves the name and number of physical
power states. And, it has a driver-specific component that is adjusted
when the driver is bound to the device. Plus, you also need to make sure,
in the drivers, that you adjust/modify the proper timer values when you
enter a specific device state.

This is all screaming for a much more complete bus-specific interface to
power management. It seems like the driver core can provide some helpers
and some common interfaces, but since most of the work is bus-specific, it
should be happening in e.g. PCI and USB.

> For user suspends (made through sysfs) the user may want to convey
> arbitrary information to a driver, things like which clocks to turn
> off, which power level to change to, and so on.  This information
> will vary from driver to driver, and the PM core shouldn't even try to
> impose any sort of structure on it.  I think the best approach will be
> to pass to the driver a character pointer giving the data written to
> /sys/.../power/state, so that users can send whatever they want just
> by writing it to the file.  This means adding an additional field to
> pm_message_t.

Uh, that would really suck. This would entail a string parser in every
driver, which is what we wanted to get away from with sysfs. A better way
would be to have a driver export a file with the specific features that it
supports encoded in a meaningful and efficient way (i.e. a fixed-length
string, character, or constant).
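As a rough illustration of what I mean, here is a userspace model in
which the driver exports its supported features as a fixed-length hex
mask and accepts a single character naming the feature to enable. The
feature names, the structure, and both functions are hypothetical; the
only thing the sketch is meant to show is that no free-form parsing is
needed on the driver side.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical feature bits a driver might advertise. */
enum model_feature {
	FEAT_CLOCK_GATE  = 1u << 0,
	FEAT_LOW_VOLTAGE = 1u << 1,
	FEAT_WAKEUP      = 1u << 2,
};

struct model_drv {
	unsigned supported;	/* mask of features the driver offers */
	unsigned enabled;	/* mask of features currently requested */
};

/* "show" side: always two hex digits plus a newline. Returns the
 * number of characters written, as snprintf() reports it. */
int features_show(const struct model_drv *drv, char *buf, size_t len)
{
	assert(drv != NULL && buf != NULL);
	return snprintf(buf, len, "%02x\n", drv->supported & 0xffu);
}

/* "store" side: one character, '0'..'7', naming a bit position.
 * Anything else, or an unsupported bit, is rejected with -EINVAL. */
int feature_store(struct model_drv *drv, char c)
{
	unsigned bit;

	assert(drv != NULL);
	if (c < '0' || c > '7')
		return -EINVAL;
	bit = 1u << (c - '0');
	if (!(drv->supported & bit))
		return -EINVAL;
	drv->enabled |= bit;
	return 0;
}
```

Userspace reads the mask once to learn what the driver supports, and
each write is validated with a range check and a mask test.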

	Pat