On Wed, Feb 01, 2017 at 12:09:40PM +0000, Tvrtko Ursulin wrote: > > On 01/02/2017 10:24, Chris Wilson wrote: > >This emulates execlists on top of the GuC in order to defer submission of > >requests to the hardware. This deferral allows time for high priority > >requests to gazump their way to the head of the queue, however it nerfs > >the GuC by converting it back into a simple execlist (where the CPU has > >to wake up after every request to feed new commands into the GuC). > > Not every request since it does have two virtual ports, or even a > bit better with request merging. Just saying so it sounds less > dramatic for people reading commit messages. We still wakeup after the first virtual port is drained, with the hope that the second is full to prevent the stall. What you meant is that we coalesce requests from the same context into a port, so not every request need result in an extra interrupt. Almost every request. > >diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > >index 0ff75f2282fa..e0cec254faa5 100644 > >--- a/drivers/gpu/drm/i915/i915_irq.c > >+++ b/drivers/gpu/drm/i915/i915_irq.c > >@@ -1350,8 +1350,10 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv, > > static __always_inline void > > gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift) > > { > >- if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) > >+ if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) { > >+ tasklet_schedule(&engine->irq_tasklet); > > notify_ring(engine); > >+ } > > Would this be better: > > if (iir & (CTX | USR)) { > set_bit(); No because ^. > tasklet_hi_schedule(); > > if (iir & USR) { > notify_ring(); > } > } > > ? If we set the context-switch bit too early (and process the user-interrupt and the context-switch as two separate passes through the tasklet), we encounter the error from before that the CSB register may be undefined. Possibly not since the user interrupt should mean the ring is powered up (and so the register restored from the power context, one hopes!), but there is no reason for us to read back the registers until we see the interrupt telling us they have been updated. Note sure the best way to write that for simplicity, so I kept the patch clean. if (!(iir & ((GT_RENDER_USER_INTERRUPT | GT_CONTEXT_SWITCH_INTERRUPT) << test_shift))) return; if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted); tasklet_hi_schedule(&engine->irq_tasklet); if (iir & (GT_RENDER_USER_INTERRUPT << test_shift)) notify_ring(engine); add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-85 (-85) function old new delta gen8_gt_irq_handler 756 671 -85 Total: Before=1093982, After=1093897, chg -0.01% Ok, gcc likes that a lot more. > > if (iir & (GT_CONTEXT_SWITCH_INTERRUPT << test_shift)) { > > set_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted); > >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > >index 0e99d53d5523..1b7fc0ffa7ad 100644 > >--- a/drivers/gpu/drm/i915/intel_lrc.c > >+++ b/drivers/gpu/drm/i915/intel_lrc.c > >@@ -1272,7 +1272,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine) > > > > /* After a GPU reset, we may have requests to replay */ > > clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted); > >- if (!execlists_elsp_idle(engine)) { > >+ if (!i915.enable_guc_submission && !execlists_elsp_idle(engine)) { > > engine->execlist_port[0].count = 0; > > engine->execlist_port[1].count = 0; > > execlists_submit_ports(engine); > >@@ -1340,9 +1340,6 @@ static void reset_common_ring(struct intel_engine_cs *engine, > > request->ring->last_retired_head = -1; > > intel_ring_update_space(request->ring); > > > >- if (i915.enable_guc_submission) > >- return; > >- > > /* Catch up with any missed context-switch interrupts */ > > I915_WRITE(RING_CONTEXT_STATUS_PTR(engine), _MASKED_FIELD(0xffff, 0)); > > Should we put this write under the i915.enable_guc_submission > perhasp? Maybe it makes no difference since this is after the reset > but don't know, perhaps it is more logical. Perhaps. Although now with the set_bit(CTX_SWITCH), we should be able to remove this line entirely. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx