Re: [PATCH 03/11] drm/i915/execlists: Suppress redundant preemption

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Fri, 1 Mar 2019 15:07:58 +0000

On 01/03/2019 11:36, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-03-01 11:31:26)

ping on below

On 28/02/2019 13:11, Tvrtko Ursulin wrote:

On 26/02/2019 10:23, Chris Wilson wrote:
On unwinding the active request we give it a small (limited to internal
priority levels) boost to prevent it from being gazumped a second time.
However, this means that it can be promoted to above the request that
triggered the preemption request, causing a preempt-to-idle cycle for no
change. We can avoid this if we take the boost into account when
checking if the preemption request is valid.

v2: After preemption the active request will be after the preemptee if
they end up with equal priority.

v3: Tvrtko pointed out that this, the existing logic, makes
I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!

v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
v5: Except not all priorities were made equal, and the WAIT not
preempting
is only if we start off as !NEWCLIENT.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
   drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
   1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c
b/drivers/gpu/drm/i915/intel_lrc.c
index 0e20f3bc8210..dba19baf6808 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -164,6 +164,8 @@
   #define WA_TAIL_DWORDS 2
   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
+#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
+
   static int execlists_context_deferred_alloc(struct i915_gem_context
*ctx,
                           struct intel_engine_cs *engine,
                           struct intel_context *ce);
@@ -190,8 +192,30 @@ static inline int rq_prio(const struct
i915_request *rq)
   static int effective_prio(const struct i915_request *rq)
   {
+    int prio = rq_prio(rq);
+
+    /*
+     * On unwinding the active request, we give it a priority bump
+     * equivalent to a freshly submitted request. This protects it from
+     * being gazumped again, but it would be preferable if we didn't
+     * let it be gazumped in the first place!
+     *
+     * See __unwind_incomplete_requests()
+     */
+    if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
+        /*
+         * After preemption, we insert the active request at the
+         * end of the new priority level. This means that we will be
+         * _lower_ priority than the preemptee all things equal (and
+         * so the preemption is valid), so adjust our comparison
+         * accordingly.
+         */
+        prio |= ACTIVE_PRIORITY;
+        prio--;
+    }
+
       /* Restrict mere WAIT boosts from triggering preemption */
-    return rq_prio(rq) | __NO_PREEMPTION;
+    return prio | __NO_PREEMPTION;
   }
   static int queue_prio(const struct intel_engine_execlists *execlists)
@@ -359,7 +383,7 @@ __unwind_incomplete_requests(struct
intel_engine_cs *engine)
   {
       struct i915_request *rq, *rn, *active = NULL;
       struct list_head *uninitialized_var(pl);
-    int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
+    int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
       lockdep_assert_held(&engine->timeline.lock);
@@ -390,9 +414,15 @@ __unwind_incomplete_requests(struct
intel_engine_cs *engine)
        * The active request is now effectively the start of a new client
        * stream, so give it the equivalent small priority bump to prevent
        * it being gazumped a second time by another peer.
+     *
+     * One consequence of this preemption boost is that we may jump
+     * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
+     * making those priorities non-preemptible. They will be moved
forward

After the previous patch wait priority is non-preemptible by definition
making this suggestion preemption boost is making it so not accurate.

+     * in the priority queue, but they will not gain immediate access to
+     * the GPU.
        */
-    if (!(prio & I915_PRIORITY_NEWCLIENT)) {
-        prio |= I915_PRIORITY_NEWCLIENT;
+    if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {

What is the importance of the has_started check? Hasn't the active
request been running by definition?

No. Semaphores. This is all about defending against incorrect promotion
while a request is still spinning on its dependencies (or else we get
promoted above them and PI is broken).

Is init_breadcrumb after the semaphore, ie. __i915_request_has_started 
will be false while spinning on the semaphore. That possibly makes 
sense.. But you know what I'll say next. It is extremely subtle and 
sprinkled over the code so here we definitely need a comment explaining it.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx