On Tue, 8 Mar 2005, Adam Belay wrote:

> I'm concerned that updating the time after every read or access could
> have a negative impact on performance.  It would be nice if we could
> discuss how this might be implemented.

It would make more sense to start with ->open() and ->close(), since those
imply when the device is being used. For drivers that want to do automatic
Runtime PM while the device is opened, we'll just leave it up to them to
cook up their own scheme, since it seems like too much of a special case.

> I'm not sure if I agree that a parent can be suspended without first
> suspending its children.  In general, a parent device can be lowered in
> power only if the context and operation of the child devices are
> maintained.  If the change in state does not affect the operation of
> child devices, then it really isn't a "suspend".

What if there are N PCI devices on a subordinate PCI bus that are not bound
to any drivers, and effectively unused, and their controlling bridge
supports PM? We won't be able to power down the individual devices without
a driver, but assuming we have some sort of Bridge Driver, we could shut
down the entire bus.

In general, we will always want to try to power down children, but even if
we can't, we still want to try to power down the parents. They just have to
know when it's safe to do it, which requires some extra checking, but
recall that parent devices are less common than leaf devices, and the
checks can be done for an entire class of devices (like all PCI-PCI
bridges) and re-used for each instance.

All in all, though, this is a corner case, and not necessary in the
immediate future.

> > This would be doing something like partial-tree suspends, but I'm not sure
> > if this is best done in the kernel or in userspace with a proper tool.
>
> The basic strategy would be to lower the state of each child device, and
> then when all children are in a lower state, lower the parent (the power
> domain) to the least common denominator power state.  I think this would
> have to be done in the kernel because userspace may not be able to
> operate during this transition.

Why not?

> > > A common problem facing all drivers that do auto suspend is how to set
> > > the inactivity timeout.  Two possible answers are: add an attribute
> > > file in the /sys/.../power directory (so different devices can have
> > > different timeouts), or add a driver module parameter (so all devices
> > > using the same driver will have the same timeout).
>
> Each class could have its own policies, complete with timeout values.  In
> sysfs, the user could select which policy should be used. (e.g. performance,
> normal, powersave).

I would think it would be better to have a per-device policy that is
perhaps derived from the bus. You want a different timeout value for each
device state, and the set of device states is bus-specific.

The concepts of 'performance', 'powersave', and 'normal' are policy
decisions that are best left for the UI. The only thing the kernel should
care about is the specific values of the timers. The value of each
threshold is system-specific and should be adjustable.

It should also be a requirement of the UI that it apply the same policy
(performance, normal, powersave, etc.) to all devices in a specific class.

Hm, there seem to be some strict requirements for the UI popping up..

> Another concern I had is how to relate power states between devices.  The
> most standard format seems to be D*, as it is used by PCI, ACPI, and others.
> It's not uncommon for a child device to require the parent to be in a
> given state for wake events etc.  If the child isn't using the same names
> for power states, then how could this be possible?  Also how could a class
> level policy interact with devices that use different state names?
> I may be in favor of only using D-states.

We can't. Those are for PCI and ACPI only. Another bus type might define
D-states as well, with different semantics (like D3 being a low power
state, but not 'off').

As for parent devices, remember that they are rare compared to the majority
of other devices, and that they are bridge devices that understand the
semantics of the devices on the other side of them. It is up to them to
decide if/when they should power down, and how much.

> Finally, I'm not sure if I like the current "*probe", "*remove", "*suspend",
> and "*resume" for runtime power management.  I think it may be better to do
> something like the following:

<snip>

I agree; we've talked about this before. :) But, it's fodder for a separate
(albeit related) discussion.


	Pat
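
P.S. To make the bridge case concrete, here is a rough sketch of the kind
of check a bridge driver could run before powering down its bus. Everything
in it (struct child_dev, struct bridge_dev, bridge_can_power_down()) is made
up for illustration and is not an existing driver-model interface. It only
assumes the bridge can tell whether each child is bound to a driver, in use,
or already suspended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not part of the driver core. */
struct child_dev {
	bool has_driver;	/* bound to a driver */
	bool in_use;		/* opened or otherwise active */
	bool suspended;		/* already in a low-power state */
};

struct bridge_dev {
	const struct child_dev *children;
	size_t nr_children;
};

/*
 * A child blocks the bridge unless it is already suspended, or it is
 * unbound and unused (so there is no context or operation to preserve).
 */
bool child_blocks_powerdown(const struct child_dev *c)
{
	if (c->suspended)
		return false;
	if (!c->has_driver && !c->in_use)
		return false;
	return true;
}

/* The bridge may power down only if no child blocks it. */
bool bridge_can_power_down(const struct bridge_dev *b)
{
	assert(b != NULL);

	for (size_t i = 0; i < b->nr_children; i++)
		if (child_blocks_powerdown(&b->children[i]))
			return false;
	return true;
}
```

The check itself would be written once for a class of bridges (say, all
PCI-PCI bridges) and re-used for each instance; how far to power down is
still left to the bridge.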