On 28/07/2018 17:46, Chris Wilson wrote:
We want to expose the parameters controlling how long it takes for
us to notice and park the GPU after a stream of requests, in order to
tune the power-efficiency vs latency trade-off of a mostly idle system.
Who do we expect to tweak these and what kind of gains have they reported?
v2: retire_work has two scheduling points, update them both
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
drivers/gpu/drm/i915/Kconfig.profile | 23 +++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem.c | 8 +++++---
drivers/gpu/drm/i915/i915_gem.h | 13 +++++++++++++
3 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index 8a230eeb98df..63cb744d920d 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -24,3 +24,26 @@ config DRM_I915_SPIN_REQUEST_CS
spin prior to sleeping again.
May be 0 to disable spinning after being woken.
+
+config DRM_I915_GEM_RETIRE_DELAY
+ int
+ default 1000 # milliseconds
+ help
+ We maintain a background job whose purpose is to keep cleaning up
+ after userspace, and may be the first to spot an idle system. This
"user space" since this is help text?
+ parameter determines the interval between execution of the worker.
+
+ To reduce the number of timer wakeups, large delays (of greater than
+ a second) are rounded to the next walltime second to allow coalescing
"Wake ups", "wall time", or with dashes?
+ of multiple system timers into a single wakeup.
Do you perhaps want to omit the bit about rounding to the next wall-time
second and just say it may not be exact, so as to reserve some internal
freedom?
+
+config DRM_I915_GEM_PARK_DELAY
+ int
+ default 100 # milliseconds
+ help
+ Before parking the engines and the GPU after the final request is
+ retired, we may wait for a small delay to reduce the frequecy of
typo in frequency
+ having to park/unpark and so the latency in executing a new request.
+
+ May be 0 to immediately start parking the engines after the last
+ request.
I'd extend it to say "the last request is retired.", giving a stronger
hint that this happens not when the request has actually completed, but
when we went to retire it.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 460f256114f7..5b538a92631d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -192,7 +192,9 @@ void i915_gem_park(struct drm_i915_private *i915)
return;
/* Defer the actual call to __i915_gem_park() to prevent ping-pongs */
- mod_delayed_work(i915->wq, &i915->gt.idle_work, msecs_to_jiffies(100));
+ mod_delayed_work(i915->wq,
+ &i915->gt.idle_work,
+ msecs_to_jiffies(CONFIG_DRM_I915_GEM_PARK_DELAY));
}
void i915_gem_unpark(struct drm_i915_private *i915)
@@ -236,7 +238,7 @@ void i915_gem_unpark(struct drm_i915_private *i915)
queue_delayed_work(i915->wq,
&i915->gt.retire_work,
- round_jiffies_up_relative(HZ));
+ i915_gem_retire_delay());
}
int
@@ -3466,7 +3468,7 @@ i915_gem_retire_work_handler(struct work_struct *work)
if (READ_ONCE(dev_priv->gt.awake))
queue_delayed_work(dev_priv->wq,
&dev_priv->gt.retire_work,
- round_jiffies_up_relative(HZ));
+ i915_gem_retire_delay());
}
static void shrink_caches(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index e46592956872..c6b4971435dc 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -27,6 +27,8 @@
#include <linux/bug.h>
#include <linux/interrupt.h>
+#include <linux/jiffies.h>
+#include <linux/timer.h>
struct drm_i915_private;
@@ -76,6 +78,17 @@ struct drm_i915_private;
void i915_gem_park(struct drm_i915_private *i915);
void i915_gem_unpark(struct drm_i915_private *i915);
+static inline unsigned long i915_gem_retire_delay(void)
+{
+ const unsigned long delay =
+ msecs_to_jiffies(CONFIG_DRM_I915_GEM_RETIRE_DELAY);
+
+ if (CONFIG_DRM_I915_GEM_RETIRE_DELAY >= 1000)
+ return round_jiffies_up_relative(delay);
+ else
+ return delay;
+}
+
static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
{
if (atomic_inc_return(&t->count) == 1)
I'll leave it to you whether to pick up any of the tidies I suggested. Either way:
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx