Quoting Chris Wilson (2017-11-18 11:25:32) > The hardware needs some time to process the information received in the > ExecList Submission Port, and expects us to don't write anything new until > it has 'acknowledged' this new execlist by sending an IDLE_ACTIVE or > PREEMPTED CSB event. > > If we do not follow this, the driver could write new data into the ELSP > before HW had finishing fetching the previous one, putting us in > 'undefined behaviour' space. > > This seems to be the problem causing the spurious PREEMPTED & COMPLETE > events after a COMPLETE like the one below: > > [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3. > [] vcs0: Execlist CSB[0]: 0x00000018 _ 0x00000007 > [] vcs0: Execlist CSB[1]: 0x00000001 _ 0x00000000 > [] vcs0: Execlist CSB[2]: 0x00000018 _ 0x00000007 <<< COMPLETE > [] vcs0: Execlist CSB[3]: 0x00000012 _ 0x00000007 <<< PREEMPTED & COMPLETE > [] vcs0: Execlist CSB[4]: 0x00008002 _ 0x00000006 > [] vcs0: Execlist CSB[5]: 0x00000014 _ 0x00000006 > > The ELSP writes that lead to this CSB sequence show that the HW hadn't > started executing the previous execlist (the one with only ctx 0x6) by the > time the new one was submitted; this is a bit more clear in the data > show in the EXECLIST_STATUS register at the time of the ELSP write. > > [] vcs0: ELSP[0] = 0x0_0 [execlist1] - status_reg = 0x0_302 > [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302 > > [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308 > [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308 > > Note that having to wait for this ack does not disable lite-restores, > although it may reduce their numbers. > > v2: Rewrote Michel's patch, his digging and his fix, my spelling. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035 > Suggested-by: Michel Thierry <michel.thierry@xxxxxxxxx> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Michel Thierry <michel.thierry@xxxxxxxxx> > --- > drivers/gpu/drm/i915/intel_lrc.c | 25 +++++++++++++++++++++++-- > drivers/gpu/drm/i915/intel_ringbuffer.h | 1 + > 2 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index c2cfdfdc0722..2694bc40f0fe 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -479,6 +479,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine) > > elsp_write(desc, elsp); > } > + execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK); > } > > static bool ctx_single_port_submission(const struct i915_gem_context *ctx) > @@ -531,6 +532,7 @@ static void inject_preempt_context(struct intel_engine_cs *engine) > elsp_write(0, elsp); > > elsp_write(ce->lrc_desc, elsp); > + execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK); > } > > static void execlists_dequeue(struct intel_engine_cs *engine) > @@ -577,9 +579,20 @@ static void execlists_dequeue(struct intel_engine_cs *engine) > * know the next preemption status we see corresponds > * to this ELSP update. > */ > + GEM_BUG_ON(!port_count(&port[0])); > if (port_count(&port[0]) > 1) > goto unlock; > > + /* > + * If we write to ELSP a second time before the HW has had > + * a chance to respond to the previous write, we can confuse > + * the HW and hit "undefined behaviour". After writing to ELSP, > + * we must then wait until we see a context-switch event from > + * the HW to indicate that it has had a chance to respond. > + */ > + if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK)) > + goto unlock; > + > if (HAS_LOGICAL_RING_PREEMPTION(engine->i915) && > rb_entry(rb, struct i915_priolist, node)->priority > > max(last->priotree.priority, 0)) { > @@ -876,6 +889,15 @@ static void execlists_submission_tasklet(unsigned long data) > if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK)) > continue; > > + /* > + * Note that we could acknowledge the first IDLE_ACTIVE > + * event as the HW having processed the first ELSP > + * write, we just choose not to for simplicity. Once > + * a flow is established, lite-restore events provide > + * us with the acks we need. > + */ > + execlists_set_active(execlists, EXECLISTS_ACTIVE_HWACK); On second thought, and thankfully we already have tests to demonstrate this, we do need to respond to IDLE_ACTIVE to allow preemption on the first submit. As to the assert hit during gem_exec_schedule, that is another question. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx