On Fri, Dec 08, 2017 at 12:10:33PM +0000, Chris Wilson wrote: > The intent here was that we would be listening to > i915_gem_request_unsubmit in order to cancel the signaler quickly and > release the reference on the request. Cancelling the signaler is done > directly via intel_engine_cancel_signaling (called from unsubmit), but > that does not directly wake up the signaling thread, and neither does > setting the request->global_seqno back to zero wake up listeners to the > request->execute waitqueue. So the only time that listening to the > request->execute waitqueue would wake up the signaling kthread would be > on the request resubmission, during which time we would already receive > wake ups from rejoining the global breadcrumbs wait rbtree. > > Trying to wake up to release the request remains an issue. If the > signaling was cancelled and no other request required signaling, then it > is possible for us to shutdown with the reference on the request still > held. To ensure that we do not try to shutdown, leaking that request, we > kick the signaling threads whenever we disarm the breadcrumbs, i.e. on > parking the engine when idle. > > Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue") > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx> I don't really understand the details, but with this patch the signaler kthread looks like it follows a race-free pattern again: - tsk->state = UNITERRUPTIBLE - implicitly added to the wait key - since we always use wake_up_process b->signaller that step isn't skipped] - check the condition we're waiting on - schedule() Acked-by: Daniel Vetter <daniel.vetter@xxxxxxxx> But probably better some real experts also pour over this :-) -Daniel > --- > drivers/gpu/drm/i915/intel_breadcrumbs.c | 18 +++++++++--------- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c > index 5ae2d276f7f3..b826725989b1 100644 > --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c > +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c > @@ -216,7 +216,7 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine) > struct intel_wait *wait, *n, *first; > > if (!b->irq_armed) > - return; > + goto wakeup_signaler; > > /* We only disarm the irq when we are idle (all requests completed), > * so if the bottom-half remains asleep, it missed the request > @@ -239,6 +239,14 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine) > b->waiters = RB_ROOT; > > spin_unlock_irq(&b->rb_lock); > + > + /* > + * The signaling thread may be asleep holding a reference to a request, > + * that had its signaling cancelled prior to being preempted. We need > + * to kick the signaler, just in case, to release any such reference. > + */ > +wakeup_signaler: > + wake_up_process(b->signaler); > } > > static bool use_fake_irq(const struct intel_breadcrumbs *b) > @@ -683,8 +691,6 @@ static int intel_breadcrumbs_signaler(void *arg) > } > > if (unlikely(do_schedule)) { > - DEFINE_WAIT(exec); > - > if (kthread_should_park()) > kthread_parkme(); > > @@ -693,13 +699,7 @@ static int intel_breadcrumbs_signaler(void *arg) > break; > } > > - if (request) > - add_wait_queue(&request->execute, &exec); > - > schedule(); > - > - if (request) > - remove_wait_queue(&request->execute, &exec); > } > i915_gem_request_put(request); > } while (1); > -- > 2.15.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx