On Fri, Jun 03, 2016 at 05:08:34PM +0100, Chris Wilson wrote: > We can forgo queuing the hangcheck from the start of every request to > until we wait upon a request. This reduces the overhead of every > request, but may increase the latency of detecting a hang. Howeever, if > nothing every waits upon a hang, did it ever hang? It also improves the > robustness of the wait-request by ensuring that the hangchecker is > indeed running before we sleep indefinitely (and thereby ensuring that > we never actually sleep forever waiting for a dead GPU). > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> I think this will run into TDR patches, where we want a super-low-latency hangcheck in some cases. But then I think that's implemented by wrapping the batch in some special cs commands to insta-kill the engine if the timeout expired, so probably not a big problem. Still worth it to double-check with Mika I'd say. -Daniel > --- > drivers/gpu/drm/i915/i915_gem.c | 9 +++++---- > drivers/gpu/drm/i915/i915_irq.c | 10 ++++------ > 2 files changed, 9 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index c7a67a7412cd..03256f096ab6 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -1310,6 +1310,9 @@ int __i915_wait_request(struct drm_i915_gem_request *req, > break; > } > > + /* Ensure that even if the GPU hangs, we get woken up. */ > + i915_queue_hangcheck(dev_priv); > + > timer.function = NULL; > if (timeout || missed_irq(dev_priv, engine)) { > unsigned long expire; > @@ -2674,8 +2677,6 @@ void __i915_add_request(struct drm_i915_gem_request *request, > /* Not allowed to fail! */ > WARN(ret, "emit|add_request failed: %d!\n", ret); > > - i915_queue_hangcheck(engine->i915); > - > queue_delayed_work(dev_priv->wq, > &dev_priv->mm.retire_work, > round_jiffies_up_relative(HZ)); > @@ -3019,8 +3020,8 @@ i915_gem_retire_requests(struct drm_i915_private *dev_priv) > > if (idle) > mod_delayed_work(dev_priv->wq, > - &dev_priv->mm.idle_work, > - msecs_to_jiffies(100)); > + &dev_priv->mm.idle_work, > + msecs_to_jiffies(100)); > > return idle; > } > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 5c7378374ae6..1303d7c034d3 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -3134,10 +3134,10 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > intel_uncore_arm_unclaimed_mmio_detection(dev_priv); > > for_each_engine_id(engine, dev_priv, id) { > + bool busy = waitqueue_active(&engine->irq_queue); > u64 acthd; > u32 seqno; > unsigned user_interrupts; > - bool busy = true; > > semaphore_clear_deadlocks(dev_priv); > > @@ -3160,12 +3160,11 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > if (engine->hangcheck.seqno == seqno) { > if (ring_idle(engine, seqno)) { > engine->hangcheck.action = HANGCHECK_IDLE; > - if (waitqueue_active(&engine->irq_queue)) { > + if (busy) { > /* Safeguard against driver failure */ > user_interrupts = kick_waiters(engine); > engine->hangcheck.score += BUSY; > - } else > - busy = false; > + } > } else { > /* We always increment the hangcheck score > * if the ring is busy and still processing > @@ -3239,9 +3238,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work) > goto out; > } > > + /* Reset timer in case GPU hangs without another request being added */ > if (busy_count) > - /* Reset timer case chip hangs without another request > - * being added */ > i915_queue_hangcheck(dev_priv); > > out: > -- > 2.8.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx