Quoting Tvrtko Ursulin (2017-12-11 14:19:47) > > On 08/12/2017 12:10, Chris Wilson wrote: > > The intent here was that we would be listening to > > i915_gem_request_unsubmit in order to cancel the signaler quickly and > > release the reference on the request. Cancelling the signaler is done > > directly via intel_engine_cancel_signaling (called from unsubmit), but > > that does not directly wake up the signaling thread, and neither does > > setting the request->global_seqno back to zero wake up listeners to the > > request->execute waitqueue. So the only time that listening to the > > request->execute waitqueue would wake up the signaling kthread would be > > on the request resubmission, during which time we would already receive > > wake ups from rejoining the global breadcrumbs wait rbtree. > > > > Trying to wake up to release the request remains an issue. If the > > signaling was cancelled and no other request required signaling, then it > > is possible for us to shutdown with the reference on the request still > > held. To ensure that we do not try to shutdown, leaking that request, we > > kick the signaling threads whenever we disarm the breadcrumbs, i.e. on > > parking the engine when idle. > > > > Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue") > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/intel_breadcrumbs.c | 18 +++++++++--------- > > 1 file changed, 9 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c > > index 5ae2d276f7f3..b826725989b1 100644 > > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c > > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c > > @@ -216,7 +216,7 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine) > > struct intel_wait *wait, *n, *first; > > > > if (!b->irq_armed) > > - return; > > + goto wakeup_signaler; > > > > /* We only disarm the irq when we are idle (all requests completed), > > * so if the bottom-half remains asleep, it missed the request > > @@ -239,6 +239,14 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine) > > b->waiters = RB_ROOT; > > > > spin_unlock_irq(&b->rb_lock); > > + > > + /* > > + * The signaling thread may be asleep holding a reference to a request, > > + * that had its signaling cancelled prior to being preempted. We need > > + * to kick the signaler, just in case, to release any such reference. > > + */ > > +wakeup_signaler: > > + wake_up_process(b->signaler); One thing I was thinking about from our conversation, is whether we can rely on the kthread_stop() kick for the final ref release? Whether it's even a good idea to let the reference live for that long? Basically it will mean replacing the GEM_BUG_ON(request) there with another unref. But it will remove the complication and assoicated commentary here. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx