Re: [PATCH 4/5] drm/i915/gt: Adopt engine_park synchronisation rules for engine_retire

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Thu, 21 Nov 2019 16:17:59 +0000

On 21/11/2019 14:53, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2019-11-21 14:42:56)

On 21/11/2019 13:51, Chris Wilson wrote:
In the next patch, we will introduce a new asynchronous retirement
worker, fed by execlists CS events. Here we may queue a retirement as
soon as a request is submitted to HW (and completes instantly), and we
also want to process that retirement as early as possible and cannot
afford to postpone (as there may not be another opportunity to retire it
for a few seconds). To allow the new async retirer to run in parallel
with our submission, pull the __i915_request_queue (that passes the
request to HW) inside the timelines spinlock so that the retirement
cannot release the timeline before we have completed the submission.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 29 ++++++++++++++++-------
   1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 373a4b9f159c..bd0af02bea16 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -74,18 +74,33 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
   #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
   
   static void
-__intel_timeline_enter_and_release_pm(struct intel_timeline *tl,
-                                   struct intel_engine_cs *engine)
+__queue_and_release_pm(struct i915_request *rq,
+                    struct intel_timeline *tl,
+                    struct intel_engine_cs *engine)
   {
       struct intel_gt_timelines *timelines = &engine->gt->timelines;
   
+     /*
+      * We have to serialise all potential retirement paths with our
+      * submission, as we don't want to underflow either the
+      * engine->wakeref.counter or our timeline->active_count.
+      *
+      * Equally, we cannot allow a new submission to start until
+      * after we finish queueing, nor could we allow that submitter
+      * to retire us before we are ready!
+      */
       spin_lock(&timelines->lock);
   
-     if (!atomic_fetch_inc(&tl->active_count))
-             list_add_tail(&tl->link, &timelines->active_list);
+     /* Hand the request over to HW and so engine_retire() */
+     __i915_request_queue(rq, NULL);
   
+     /* Let new submissions commence (and maybe retire this timeline) */
       __intel_wakeref_defer_park(&engine->wakeref);
   
+     /* Let intel_gt_retire_requests() retire us */
+     if (!atomic_fetch_inc(&tl->active_count))
+             list_add_tail(&tl->link, &timelines->active_list);
+
       spin_unlock(&timelines->lock);

Now that everything is under the lock the order of operation is not
important, or it still is?

queue before unpark that is required.

unpark and add_to_timeline, the order is flexible as the lock governors
the interactions between those and retirers. So I chose to allow the
next newcomer start a few instructions earlier.

Yes, because of different locks. So the comment above 
__intel_wakeref_defer_park is not correct since timeline cannot be 
retired until the lock is dropped.

It's only preservation of timeline ordering which mandates defer_park 
after request_queue. As far as I am able to summon my own understanding 
from yesterday.

Regards,

Tvrtko




_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx