Re: [PATCH v2] drm/i915: Stop listening to request resubmission from the signaler kthread

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 11/12/2017 14:26, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2017-12-11 14:19:47)

On 08/12/2017 12:10, Chris Wilson wrote:
The intent here was that we would be listening to
i915_gem_request_unsubmit in order to cancel the signaler quickly and
release the reference on the request. Cancelling the signaler is done
directly via intel_engine_cancel_signaling (called from unsubmit), but
that does not directly wake up the signaling thread, and neither does
setting the request->global_seqno back to zero wake up listeners to the
request->execute waitqueue. So the only time that listening to the
request->execute waitqueue would wake up the signaling kthread would be
on the request resubmission, during which time we would already receive
wake ups from rejoining the global breadcrumbs wait rbtree.

Trying to wake up to release the request remains an issue. If the
signaling was cancelled and no other request required signaling, then it
is possible for us to shutdown with the reference on the request still
held. To ensure that we do not try to shutdown, leaking that request, we
kick the signaling threads whenever we disarm the breadcrumbs, i.e. on
parking the engine when idle.

Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
---
   drivers/gpu/drm/i915/intel_breadcrumbs.c | 18 +++++++++---------
   1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 5ae2d276f7f3..b826725989b1 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -216,7 +216,7 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
       struct intel_wait *wait, *n, *first;
if (!b->irq_armed)
-             return;
+             goto wakeup_signaler;
/* We only disarm the irq when we are idle (all requests completed),
        * so if the bottom-half remains asleep, it missed the request
@@ -239,6 +239,14 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
       b->waiters = RB_ROOT;
spin_unlock_irq(&b->rb_lock);
+
+     /*
+      * The signaling thread may be asleep holding a reference to a request,
+      * that had its signaling cancelled prior to being preempted. We need
+      * to kick the signaler, just in case, to release any such reference.
+      */
+wakeup_signaler:
+     wake_up_process(b->signaler);

One thing I was thinking about from our conversation, is whether we can
rely on the kthread_stop() kick for the final ref release? Whether it's
even a good idea to let the reference live for that long?

Probably not, feels awkward. In that sense previous version feels better.

Basically it will mean replacing the GEM_BUG_ON(request) there with
another unref. But it will remove the complication and assoicated
commentary here.

Hm, and is there a test which was able to hit the GEM_BUG_ON? It needed to go in the previous version as well, no?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux