[linux-pm] Some thoughts on suspend/resume development

On Tue, 8 Mar 2005, Patrick Mochel wrote:

> Ugh. I see there is still disagreement about naming. What type of platform
> uses that naming scheme? I've always been under the impression that STR,
> STD, and Standby were generic names; at least that is what has been stated
> in the code and email for ~4 years..

Certainly they are generic names.  But they don't apply to all sorts of 
systems.  Suspend-To-Disk makes little sense for an embedded system 
without a disk drive, for example.

> > Now it's time to consider how to implement additional power-saving
> > measures -- in other words, selective suspends.
> 
> This has commonly been referred to as "Runtime Power Management" or more
> generally "Device Power Management" (as opposed to "System Power
> Management").

I like "Device Power Management".  "Runtime Power Management" sounds 
redundant; who worries about power management when the computer's not 
running?  ("Runtime-Power Management" would be a little better.)


> > A common problem for all selective suspends is that, unlike system
> > sleeps, they can occur at any time.  Drivers will get very confused
> > unless we can guarantee somehow that, at a minimum, they will not
> > receive a suspend or resume call for a device while its probe or
> > release routine is running.
> 
> That's a good point. In general, a driver should only get suspend/resume
> calls when it's bound to a device, which is technically after ->probe()
> and before ->release(). This means that the interface for controlling
> power state should only be exposed during that time. (Currently the power/
> directory is added when the (physical) device is registered.)
> 
> We can change this by not adding the power/ directory (and associated
> files) until after the driver is bound.

That solves half the problem, but it won't prevent a driver from being
unbound while a suspend or resume call is in progress.
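
To illustrate the missing half, here is a rough userspace sketch in which
unbinding and suspending take the same per-device lock, so a driver cannot
be unbound in the middle of a suspend call.  Every name in it (sketch_device,
sketch_unbind, and so on) is hypothetical, and a pthread mutex stands in for
whatever lock the core or the bus would really use.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
struct sketch_driver {
    int (*suspend)(void *dev_data);
};

struct sketch_device {
    pthread_mutex_t lock;          /* serializes bind/unbind against PM calls */
    struct sketch_driver *driver;  /* NULL while no driver is bound */
    void *dev_data;
    bool suspended;
};

/* Trivial suspend callback used only to exercise the sketch. */
static int sketch_noop_suspend(void *dev_data)
{
    (void)dev_data;
    return 0;
}

/* Unbind needs the lock, so it waits for any suspend call in progress. */
void sketch_unbind(struct sketch_device *dev)
{
    pthread_mutex_lock(&dev->lock);
    dev->driver = NULL;
    pthread_mutex_unlock(&dev->lock);
}

/* Returns 0 on success, -1 if no driver is bound, or the driver's error. */
int sketch_suspend(struct sketch_device *dev)
{
    int ret = -1;

    pthread_mutex_lock(&dev->lock);
    if (dev->driver && dev->driver->suspend) {
        ret = dev->driver->suspend(dev->dev_data);
        if (ret == 0)
            dev->suspended = true;
    }
    pthread_mutex_unlock(&dev->lock);
    return ret;
}
```

Whether such a lock belongs in the core or in each bus subsystem is exactly
the open question; the sketch takes no position on that.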

>  But, I think a better solution
> would be for the bus subsystems to add/remove the power control files,
> since it a) knows when the driver is bound/unbound and b) is likely to
> have a bus-specific interface (like the name and number of power states to
> enter).

So in essence you would place the responsibility for mutual exclusion on 
the bus subsystem and not on the PM/driver-model core?

> This would also easily allow the bus to provide a default power interface
> for devices that are not bound to drivers.

A good point that is often overlooked.


> > An important difference between system sleep and selective suspend is
> > that with selective suspend, we generally expect the device to resume
> > on demand.  This demand may take the form of a request to the driver
> > (e.g., a block I/O request for a disk device) or a resume request from
> > the device itself (e.g., a notification from a mouse that has just
> > been moved).  This means that input queues must not be plugged and
> > device interrupts must remain enabled, exactly the opposite of what
> > happens during system sleep.  For this reason it is vital for drivers
> > to know whether a suspend call is invoking a system sleep or a
> > selective suspend.  Hence I propose that a new pm_message_t event code,
> > PMSG_SELECTIVE (or maybe PMSG_SELECTIVE_SUSPEND), be used for selective
> > suspends.
> 
> I +/- agree, though I think there also must be a way to completely suspend
> the device, like when you are doing a system suspend.

In other words, the interface must allow the user to specify both a power
level and whether resume-on-demand is enabled.  The lowest power level
together with no resume-on-demand should have the same effect as a system
sleep.
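
As a minimal sketch of such a request, assuming a numeric power level and a
single flag (the struct, the field names, and the value of
SKETCH_LOWEST_LEVEL are all made up for illustration):

```c
#include <stdbool.h>

#define SKETCH_LOWEST_LEVEL 3  /* assumed; real buses define their own states */

/* Hypothetical request; the field names are illustrative only. */
struct sketch_pm_request {
    int level;              /* 0 = full power, SKETCH_LOWEST_LEVEL = lowest */
    bool resume_on_demand;  /* may I/O or the device itself trigger a resume? */
};

/* True when the request should have the same effect as a system sleep. */
bool sketch_is_full_suspend(const struct sketch_pm_request *req)
{
    return req->level == SKETCH_LOWEST_LEVEL && !req->resume_on_demand;
}
```

A driver would plug its queues and disable interrupts only when
sketch_is_full_suspend() is true; in every other case it has to stay able
to notice a resume request.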

> > With resume-on-demand implemented properly, a driver may decide that
> > it can suspend its device without bothering to suspend the device's
> > children.  This kind of decision should be left to individual drivers
> > and the PM core shouldn't try to enforce a "children must be suspended
> > before their parents" policy for selective suspends.
> 
> Also true, and even true for system suspend states. While some child
> devices may not support PM, a parent device could, and power down the
> entire bus. It's important that we do descendant-ancestor ordering
> correctly during system suspend transitions. For runtime transitions, we
> need a way for the driver of a parent device to return an error if its
> child devices aren't in a compatible state for it (the parent) to be
> suspended.

Exactly.  We may also need some kind of locking to prevent the child 
devices' states from changing between the time the parent's driver checks 
them and the time the parent is suspended.
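
One possible shape for that locking, sketched in userspace C with invented
names: the parent owns a lock, children change state only while holding it,
and the parent checks its children and suspends itself under a single hold
of the same lock.  The sketch assumes that "compatible" simply means
"suspended"; a real driver might accept other child states.

```c
#include <pthread.h>
#include <stdbool.h>

#define SKETCH_MAX_CHILDREN 4

/* Hypothetical parent device; one lock covers its children's states. */
struct sketch_parent {
    pthread_mutex_t lock;
    bool suspended;
    int nchildren;
    bool child_suspended[SKETCH_MAX_CHILDREN];
};

/* Children change state only while holding the parent's lock.
 * Returns 0 on success, -1 on a bad index or if the parent is suspended
 * and the child asks to become active. */
int sketch_child_set_suspended(struct sketch_parent *p, int idx, bool suspended)
{
    int ret = 0;

    pthread_mutex_lock(&p->lock);
    if (idx < 0 || idx >= p->nchildren)
        ret = -1;
    else if (!suspended && p->suspended)
        ret = -1;  /* the parent has to be resumed first */
    else
        p->child_suspended[idx] = suspended;
    pthread_mutex_unlock(&p->lock);
    return ret;
}

/* Check and suspend under one lock hold, so no child can change state
 * in between.  Returns 0 on success, -1 if some child is still active. */
int sketch_parent_suspend(struct sketch_parent *p)
{
    int i, ret = 0;

    pthread_mutex_lock(&p->lock);
    for (i = 0; i < p->nchildren; i++) {
        if (!p->child_suspended[i]) {
            ret = -1;
            break;
        }
    }
    if (ret == 0)
        p->suspended = true;
    pthread_mutex_unlock(&p->lock);
    return ret;
}
```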

> This would be doing something like partial-tree suspends, but I'm not sure
> if this is best done in the kernel or in userspace with a proper tool.

One might have a user interface for "suspend this device and all its
descendants".  But I tend to think this is best left to a userspace tool.

The flip side is to have an interface for "resume this device and all its 
ancestors".  This might turn out to be more useful in the kernel, for 
drivers that want to do an on-demand resume and are stuck because the 
parent device has done an idle-timeout suspend.
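
Roughly, and with hypothetical names throughout, such a helper would walk
up the parent links and resume from the top down, so that no device is
resumed while its parent is still suspended.  The sequence counter exists
only to make the ordering visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node; only the parent link matters here. */
struct sketch_node {
    struct sketch_node *parent;  /* NULL at the root of the tree */
    bool suspended;
    int resume_seq;              /* order in which this node was resumed */
};

/*
 * Resume dev and every suspended ancestor, topmost first.
 * *seq counts the resumes performed; nodes that were already active
 * are left untouched.
 */
void sketch_resume_with_ancestors(struct sketch_node *dev, int *seq)
{
    if (dev == NULL)
        return;
    /* Handle the ancestors before touching this device. */
    sketch_resume_with_ancestors(dev->parent, seq);
    if (dev->suspended) {
        dev->suspended = false;
        dev->resume_seq = ++*seq;
    }
}
```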

> > A common problem facing all drivers that do auto suspend is how to set
> > the inactivity timeout.  Two possible answers are: add an attribute
> > file in the /sys/.../power directory (so different devices can have
> > different timeouts), or add a driver module parameter (so all devices
> > using the same driver will have the same timeout).
> 
> It's trickier than that. You want a per-device parameter that can be
> adjusted. You also want a per-state parameter so that a device can
> gradually enter a deeper and deeper state over time. (You can do it with 1
> timer per device that is set to the timeout value of the next state when
> one fires, but that's an implementation issue).
> 
> So, it's bus-specific because it involves the name and number of physical
> power states. And, it has a driver-specific component that is adjusted
> when the driver is bound to the device. Plus, you also need to make sure,
> in the drivers, that you adjust/modify the proper timer values when you
> enter a specific device state.

I wonder if you aren't adding more complexity than is really needed.  In 
any case, this should be handled at the level of individual bus and device 
drivers.  It would be nice to specify a common interface for controlling 
such things, though...
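
For reference, the one-timer-per-device scheme described above might look
something like this.  It is only a sketch under assumed names: timeouts[i]
is the idle time to spend in state i before dropping to state i + 1, and
the caller is expected to re-arm its single timer with whatever delay is
returned.

```c
#include <stddef.h>

/* Hypothetical per-device idle policy.  timeouts[] holds nstates - 1
 * entries, in milliseconds; state 0 is full power. */
struct sketch_idle_policy {
    const unsigned int *timeouts;
    size_t nstates;
    size_t state;   /* current power state */
};

/*
 * Called when the single idle timer fires.  Moves one state deeper and
 * returns the delay to re-arm the timer with, or 0 if the device is now
 * (or was already) in its deepest state and the timer should stay stopped.
 */
unsigned int sketch_idle_timer_fired(struct sketch_idle_policy *p)
{
    if (p->state + 1 >= p->nstates)
        return 0;
    p->state++;
    if (p->state + 1 >= p->nstates)
        return 0;
    return p->timeouts[p->state];
}

/* Any activity returns the device to full power and restarts the sequence. */
unsigned int sketch_idle_activity(struct sketch_idle_policy *p)
{
    p->state = 0;
    return p->nstates > 1 ? p->timeouts[0] : 0;
}
```

The table of timeouts is the bus- and driver-specific part; the stepping
logic is the sort of thing that could be shared.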

> This is all screaming for a much more complete bus-specific interface to
> power management. It seems like the driver core can provide some helpers
> and some common interfaces, but since most of the work is bus-specific, it
> should be happening in e.g. PCI and USB..

Our problem is to identify the common mechanisms that can usefully be 
abstracted into the core.


> > For user suspends (made through sysfs) the user may want to convey
> > arbitrary information to a driver, things like which clocks to turn
> > off, which power level to change to, and so on.  This information
> > will vary from driver to driver, and the PM core shouldn't even try to
> > impose any sort of structure on it.  I think the best approach will be
> > to pass to the driver a character pointer giving the data written to
> > /sys/.../power/state, so that users can send whatever they want just
> > by writing it to the file.  This means adding an additional field to
> > pm_message_t.
> 
> Uh, that would really suck. This would entail a string parser in every
> driver, which is what we wanted to get away from with sysfs. A better way
> would be to have a driver export a file with the specific features that it
> supports encoded in a meaningful and efficient way (i.e. a fixed-length
> string, character, or constant).

What I said was consistent with fixed-length strings, characters, or 
constants.  But since the range of possible messages a driver might want 
is open-ended, the PM core shouldn't try to impose its own structure on 
the data.
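
A sketch of what I have in mind, using an invented struct rather than the
real pm_message_t: the core hands over the bytes written to power/state
without interpreting them, and a driver that only wants a one-character
level needs nothing more than a range check.

```c
#include <stddef.h>

/* Hypothetical message; this is not the real pm_message_t layout. */
struct sketch_pm_message {
    int event;          /* core-defined event code */
    const char *data;   /* bytes written to power/state, opaque to the core */
    size_t len;
};

/* Hypothetical driver that accepts one-character power levels '0'..'3'.
 * Returns the level, or -1 if the data is not something it understands. */
int sketch_driver_parse_level(const struct sketch_pm_message *msg)
{
    size_t len = msg->len;

    if (msg->data == NULL)
        return -1;
    /* Tolerate the trailing newline that "echo" appends. */
    if (len > 0 && msg->data[len - 1] == '\n')
        len--;
    if (len != 1 || msg->data[0] < '0' || msg->data[0] > '3')
        return -1;
    return msg->data[0] - '0';
}
```

A driver with richer needs would interpret the same bytes differently,
which is why the core stays out of it.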

Alan Stern

