Hi Ulf,
Thanks for taking time to review this.
On Wed, Oct 14 2020 at 04:29 -0600, Ulf Hansson wrote:
On Tue, 13 Oct 2020 at 00:34, Lina Iyer <ilina@xxxxxxxxxxxxxx> wrote:
Some devices have a predictable interrupt pattern while executing a
particular usecase. An example would be the VSYNC interrupt on devices
associated with displays. A 60 Hz display could cause a periodic
interrupt every 16 ms. A PM domain that holds such a device could power
off and on at similar intervals.
Entering a domain idle state saves power, only if the domain remains in
the idle state for the amount of time greater than or equal to the
residency of the idle state. Knowing the next wakeup event of the device
will help the PM domain governor make better idle state decisions.
Let's add the pm_runtime_set_next_event() API for the device and document
the usage of the API.
Signed-off-by: Lina Iyer <ilina@xxxxxxxxxxxxxx>
---
Documentation/power/runtime_pm.rst | 21 ++++++++++++++++++++
drivers/base/power/runtime.c | 31 ++++++++++++++++++++++++++++++
include/linux/pm.h | 2 ++
include/linux/pm_runtime.h | 1 +
4 files changed, 55 insertions(+)
diff --git a/Documentation/power/runtime_pm.rst b/Documentation/power/runtime_pm.rst
index 0553008b6279..90a5ac481ad4 100644
--- a/Documentation/power/runtime_pm.rst
+++ b/Documentation/power/runtime_pm.rst
@@ -515,6 +515,14 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:
power.use_autosuspend isn't set, otherwise returns the expiration time
in jiffies
+ `int pm_runtime_set_next_event(struct device *dev, ktime_t next);`
Rather than specifying the next event, could it make sense to specify
the delta instead? I guess it depends on the behaviour of the
driver/client that calls this API...
I guess drivers would calculate the next event from an interval, but
the usage of this feature needs to account for the time at which this
call was received. I am open to taking an interval as the input and
saving the actual next event time (the current ktime plus the interval).
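To make the interval variant concrete, here is a rough userspace model of it. This is only a sketch and not kernel code: the model_ktime_t type, struct model_dev_pm and model_set_next_event_in() are hypothetical names made up for illustration, and the current time is passed in by the caller instead of being read with ktime_get().

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-in for ktime_t: signed 64-bit nanoseconds (assumption). */
typedef int64_t model_ktime_t;
#define MODEL_KTIME_MAX INT64_MAX

/* Hypothetical per-device state; models only the next_event field. */
struct model_dev_pm {
	model_ktime_t next_event;
};

/*
 * Hypothetical interval-based setter: the caller passes the time until the
 * next event and the absolute time is what gets stored, so the value stays
 * meaningful no matter when the governor reads it later.
 */
static int model_set_next_event_in(struct model_dev_pm *pm,
				   model_ktime_t now, model_ktime_t delta_ns)
{
	/* Reject a negative clock and intervals that are not in the future. */
	if (now < 0 || delta_ns <= 0)
		return -EINVAL;

	/* Saturate instead of overflowing the signed 64-bit value. */
	if (delta_ns > MODEL_KTIME_MAX - now)
		pm->next_event = MODEL_KTIME_MAX;
	else
		pm->next_event = now + delta_ns;

	return 0;
}
```

With this shape, a display driver would pass its 16 ms frame period as delta_ns each time it handles VSYNC, and an invalid interval leaves the previously stored value untouched.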
+ - notify runtime PM of the next event on the device. Devices that are
I would prefer to change from "notify" to "inform", just to make it
clear that this isn't a notification mechanism we are talking about.
Okay.
5. Runtime PM Initialization, Device Probing and Removal
========================================================
@@ -639,6 +648,18 @@ suspend routine). It may be necessary to resume the device and suspend it again
in order to do so. The same is true if the driver uses different power levels
or other settings for runtime suspend and system sleep.
+When a device enters idle at runtime, it may trigger the runtime PM up the
+hierarchy. Devices that have a predictable interrupt pattern may help
+influence a better idle state determination of its parent. For example, a
+display device could get a VSYNC interrupt every 16ms. A PM domain containing
+the device, could also be entering and exiting idle due to runtime PM
/containing the device/that has the device attached to it
+coordination. If the domain were also entering runtime idle, we would know when
+the domain would be woken up as a result of the display device waking up. Using
+the device's next_event, the PM domain governor can make a better choice of the
+idle state for the domain, knowing it would be woken up by the device in the
+near future. This is especially useful when the device is sensitive to its PM
+domain's idle state enter and exit latencies.
The above sounds a little hand wavy, can you try to be a little more exact?
I can try to rephrase this. What I think I should be saying is that
if the domain has multiple devices and some of them are sensitive to
the exit latency of the domain idle state, then knowing the next wakeup
would help the governor make a better domain idle state decision.
Perhaps, rather than just saying "sensitive to its PM domain's idle
state..", how about explaining that by using the "next event" the
governor is able to select a more optimal domain idle state, thus we
should avoid wasting energy and better conform to QoS latency
constraints.
QoS is not what we are trying to conform to. We are trying to provide
residency information to the domain to help it make a better choice, just
like we use the CPU's next wakeup in the cluster domain governor.
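A rough userspace model of that residency check is below. It is only a sketch under stated assumptions and not the genpd governor code: the struct model_idle_state type and both helper names are hypothetical, times are plain nanosecond counts from a non-negative monotonic clock, and INT64_MAX stands in for "no known next event" the way KTIME_MAX does in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical idle state description; only the residency matters here. */
struct model_idle_state {
	int64_t residency_ns;	/* minimum stay for the state to save power */
};

/* Earliest next event across the devices in the domain (INT64_MAX if none). */
static int64_t model_earliest_event(const int64_t *next_events, size_t n)
{
	int64_t earliest = INT64_MAX;
	size_t i;

	for (i = 0; i < n; i++)
		if (next_events[i] < earliest)
			earliest = next_events[i];

	return earliest;
}

/*
 * States are assumed to be ordered shallowest to deepest, with increasing
 * residency. Returns the index of the deepest state whose residency fits
 * before the next event, or -1 if no state is worth entering.
 */
static int model_pick_state(const struct model_idle_state *states, size_t n,
			    int64_t now_ns, int64_t next_event_ns)
{
	int64_t sleep_ns;
	int i;

	if (next_event_ns <= now_ns)
		return -1;

	sleep_ns = next_event_ns - now_ns;	/* now_ns assumed >= 0 */

	for (i = (int)n - 1; i >= 0; i--)
		if (states[i].residency_ns <= sleep_ns)
			return i;

	return -1;
}
```

For example, with residencies of 1 ms, 5 ms and 20 ms and a display device due to wake in 16 ms, the model settles on the 5 ms state and skips the 20 ms one, which could not pay off before the wakeup.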
+
During system resume, the simplest approach is to bring all devices back to full
power, even if they had been suspended before the system suspend began. There
are several reasons for this, including:
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 8143210a5c54..53c2b3d962bc 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -122,6 +122,33 @@ u64 pm_runtime_suspended_time(struct device *dev)
}
EXPORT_SYMBOL_GPL(pm_runtime_suspended_time);
+/**
+ * pm_runtime_set_next_event - Notify PM framework of an impending event.
+ * @dev: Device to handle
+ * @next: impending interrupt/wakeup for the device
At what typical points do you expect this function to be called?
Most likely at the start of the usecase and periodically when the
interrupt/work is being handled. I would expect this to change to a
different periodicity when the usecase parameters change.
+ */
+int pm_runtime_set_next_event(struct device *dev, ktime_t next)
+{
+ unsigned long flags;
+ int ret = -EINVAL;
+
+ /*
+ * Note the next pending wakeup of a device,
+ * if the device has runtime PM enabled.
+ */
/s/Note/Store
Do you really need to check if runtime PM is enabled? Does it matter?
Hmm.. This has no meaning without runtime PM. Any reason why we don't
need the check? I am okay with removing the check.
+ spin_lock_irqsave(&dev->power.lock, flags);
+ if (!dev->power.disable_depth) {
+ if (ktime_before(ktime_get(), next)) {
+ dev->power.next_event = next;
+ ret = 0;
+ }
+ }
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(pm_runtime_set_next_event);
+
/**
* pm_runtime_deactivate_timer - Deactivate given device's suspend timer.
* @dev: Device to handle.
@@ -1380,6 +1407,9 @@ void __pm_runtime_disable(struct device *dev, bool check_resume)
/* Update time accounting before disabling PM-runtime. */
update_pm_runtime_accounting(dev);
+ /* Reset the next wakeup for the device */
+ dev->power.next_event = KTIME_MAX;
+
I am not sure I get the purpose of this, can you elaborate?
I was trying to make sure that we clean up any next_events when we
disable runtime PM. But your following point negates the need.
I am thinking that the genpd governor doesn't allow powering off
the PM domain, unless all devices that are attached to it are runtime
PM enabled and runtime PM suspended (see pm_runtime_suspended). That
said, it looks like the above isn't needed? No?
Makes sense.
Perhaps it's better to use pm_runtime_enable() as the point of
resetting the dev->power.next_event?
Okay.
Thanks,
Lina