Re: [PATCH 2/2] drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Sat, 18 Nov 2017 16:36:43 +0000

Quoting Chris Wilson (2017-11-18 11:25:32)
> The hardware needs some time to process the information received in the
> ExecList Submission Port, and expects us to don't write anything new until
> it has 'acknowledged' this new execlist by sending an IDLE_ACTIVE or
> PREEMPTED CSB event.
> 
> If we do not follow this, the driver could write new data into the ELSP
> before HW had finishing fetching the previous one, putting us in
> 'undefined behaviour' space.
> 
> This seems to be the problem causing the spurious PREEMPTED & COMPLETE
> events after a COMPLETE like the one below:
> 
> [] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3.
> [] vcs0:  Execlist CSB[0]: 0x00000018 _ 0x00000007
> [] vcs0:  Execlist CSB[1]: 0x00000001 _ 0x00000000
> [] vcs0:  Execlist CSB[2]: 0x00000018 _ 0x00000007  <<< COMPLETE
> [] vcs0:  Execlist CSB[3]: 0x00000012 _ 0x00000007  <<< PREEMPTED & COMPLETE
> [] vcs0:  Execlist CSB[4]: 0x00008002 _ 0x00000006
> [] vcs0:  Execlist CSB[5]: 0x00000014 _ 0x00000006
> 
> The ELSP writes that lead to this CSB sequence show that the HW hadn't
> started executing the previous execlist (the one with only ctx 0x6) by the
> time the new one was submitted; this is a bit more clear in the data
> show in the EXECLIST_STATUS register at the time of the ELSP write.
> 
> [] vcs0: ELSP[0] = 0x0_0        [execlist1] - status_reg = 0x0_302
> [] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302
> 
> [] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308
> [] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308
> 
> Note that having to wait for this ack does not disable lite-restores,
> although it may reduce their numbers.
> 
> v2: Rewrote Michel's patch, his digging and his fix, my spelling.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035
> Suggested-by: Michel Thierry <michel.thierry@xxxxxxxxx>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 25 +++++++++++++++++++++++--
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
>  2 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index c2cfdfdc0722..2694bc40f0fe 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -479,6 +479,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  
>                 elsp_write(desc, elsp);
>         }
> +       execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>  }
>  
>  static bool ctx_single_port_submission(const struct i915_gem_context *ctx)
> @@ -531,6 +532,7 @@ static void inject_preempt_context(struct intel_engine_cs *engine)
>                 elsp_write(0, elsp);
>  
>         elsp_write(ce->lrc_desc, elsp);
> +       execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
>  }
>  
>  static void execlists_dequeue(struct intel_engine_cs *engine)
> @@ -577,9 +579,20 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>                  * know the next preemption status we see corresponds
>                  * to this ELSP update.
>                  */
> +               GEM_BUG_ON(!port_count(&port[0]));
>                 if (port_count(&port[0]) > 1)
>                         goto unlock;
>  
> +               /*
> +                * If we write to ELSP a second time before the HW has had
> +                * a chance to respond to the previous write, we can confuse
> +                * the HW and hit "undefined behaviour". After writing to ELSP,
> +                * we must then wait until we see a context-switch event from
> +                * the HW to indicate that it has had a chance to respond.
> +                */
> +               if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
> +                       goto unlock;
> +
>                 if (HAS_LOGICAL_RING_PREEMPTION(engine->i915) &&
>                     rb_entry(rb, struct i915_priolist, node)->priority >
>                     max(last->priotree.priority, 0)) {
> @@ -876,6 +889,15 @@ static void execlists_submission_tasklet(unsigned long data)
>                         if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
>                                 continue;
>  
> +                       /*
> +                        * Note that we could acknowledge the first IDLE_ACTIVE
> +                        * event as the HW having processed the first ELSP
> +                        * write, we just choose not to for simplicity. Once
> +                        * a flow is established, lite-restore events provide
> +                        * us with the acks we need.
> +                        */
> +                       execlists_set_active(execlists, EXECLISTS_ACTIVE_HWACK);

On second thought, and thankfully we already have tests to demonstrate
this, we do need to respond to IDLE_ACTIVE to allow preemption on the
first submit.

As to the assert hit during gem_exec_schedule, that is another question.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx