Re: [PATCH] drm/i915/execlists: Preempt-to-busy

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Thu, 20 Jun 2019 14:01:59 +0100

Quoting Mika Kuoppala (2019-06-20 13:41:26)
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> > @@ -38,6 +39,10 @@ struct intel_context {
> >       struct i915_gem_context *gem_context;
> >       struct intel_engine_cs *engine;
> >       struct intel_engine_cs *inflight;
> > +#define intel_context_inflight(ce) ptr_mask_bits((ce)->inflight, 2)
> > +#define intel_context_inflight_count(ce)  ptr_unmask_bits((ce)->inflight, 2)
> > +#define intel_context_inflight_inc(ce) ptr_count_inc(&(ce)->inflight)
> > +#define intel_context_inflight_dec(ce) ptr_count_dec(&(ce)->inflight)
> 
> Just curious here that what you consider the advantages of carrying
> this info with the pointer?

Packing. I just need a bit to track status, and one for overflow.

> > +static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine)
> > +{
> > +     return (i915_ggtt_offset(engine->status_page.vma) +
> > +             I915_GEM_HWS_PREEMPT_ADDR);
> > +}
> > +
> > +#define ring_pause(E) ((E)->status_page.addr[I915_GEM_HWS_PREEMPT])
> 
> Scary. Please lets make a function of ring_pause and use
> intel_write_status_page in it.

I'd rather not unless you do __intel_write_state_page.

> So I guess you have and you want squeeze the latency fruit.
> 
> When we have everything in place, CI is green and
> everyone is happy, then we tear it down?

Been there, done that.

> > @@ -442,13 +443,11 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >               struct intel_engine_cs *owner;
> >  
> >               if (i915_request_completed(rq))
> > -                     break;
> > +                     continue; /* XXX */
> 
> Yeah, but what is the plan with the XXX.

Mulling over tracking context not requests. We still end up with having
to scan history within a context, so not yet seeing anything to
encourage me to make the change. I worry about long request queues
causing preemption latency, as this list is currently only trimmed in
retirement.

One idea in the background is for a scheduler (contemplating something
like the isosynchronous MuQSS) and that might call for a change to
using contexts as the primary, with requests within the contexts.

> > @@ -1223,68 +1217,37 @@ static void process_csb(struct intel_engine_cs *engine)
> >                * status notifier.
> >                */
> >  
> > -             GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x, active=0x%x\n",
> > +             GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x\n",
> >                         engine->name, head,
> > -                       buf[2 * head + 0], buf[2 * head + 1],
> > -                       execlists->active);
> > +                       buf[2 * head + 0], buf[2 * head + 1]);
> >  
> >               status = buf[2 * head];
> > -             if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE |
> > -                           GEN8_CTX_STATUS_PREEMPTED))
> > -                     execlists_set_active(execlists,
> > -                                          EXECLISTS_ACTIVE_HWACK);
> > -             if (status & GEN8_CTX_STATUS_ACTIVE_IDLE)
> > -                     execlists_clear_active(execlists,
> > -                                            EXECLISTS_ACTIVE_HWACK);
> > -
> > -             if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
> > -                     continue;
> > +             if (status & GEN8_CTX_STATUS_IDLE_ACTIVE) {
> > +promote:
> > +                     GEM_BUG_ON(!assert_pending_valid(execlists, "promote"));
> > +                     execlists->active =
> > +                             memcpy(execlists->inflight,
> > +                                    execlists->pending,
> > +                                    execlists_num_ports(execlists) *
> > +                                    sizeof(*execlists->pending));
> > +                     execlists->pending[0] = NULL;
> 
> I can't decide if comment or a helper inline function would
> serve better as documentation of between inflight and pending
> movement.

The magic is just this function, I think process_csb() reads quite
nicely with the 3 branches and switching between different states. It's
about 8 lines without the comments and asserts.

> I guess it is better to be left as a future work after
> the dust settles.
> 
> Just general yearning for a similar kind of level of documentation
> steps as in dequeue.
> 
> >  
> > -             /* We should never get a COMPLETED | IDLE_ACTIVE! */
> > -             GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE);
> 
> Is our assert coverage going to suffer?

You've looked at the added asserts and tracing; I claim we get stronger.

> > @@ -2514,15 +2452,29 @@ static u32 *gen8_emit_wa_tail(struct i915_request *request, u32 *cs)
> >       return cs;
> >  }
> >  
> > +static u32 *emit_preempt_busywait(struct i915_request *request, u32 *cs)
> > +{
> > +     *cs++ = MI_SEMAPHORE_WAIT |
> > +             MI_SEMAPHORE_GLOBAL_GTT |
> > +             MI_SEMAPHORE_POLL |
> > +             MI_SEMAPHORE_SAD_EQ_SDD;
> > +     *cs++ = 0;
> > +     *cs++ = intel_hws_preempt_address(request->engine);
> > +     *cs++ = 0;
> > +
> > +     return cs;
> > +}
> > +
> >  static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
> >  {
> >       cs = gen8_emit_ggtt_write(cs,
> >                                 request->fence.seqno,
> >                                 request->timeline->hwsp_offset,
> >                                 0);
> > -
> >       *cs++ = MI_USER_INTERRUPT;
> > +
> >       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> 
> This was discussed in irc, could warrant a comment here of
> why this is needed. Precious info.

Why the ARB, for reasons of yore. The comment for why we need it is
actually in bb_start.

commit 279f5a00c9a9b39f4f6e9813e6d4da8c181d34c8
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Thu Oct 5 20:10:05 2017 +0100

    drm/i915/execlists: Add a comment for the extra MI_ARB_ENABLE

> > +     cs = emit_preempt_busywait(request, cs);

Why we use the semaphore? That should be explained in dequeue upon
setting up the preemption.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx