Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > Quoting Mika Kuoppala (2019-06-20 13:41:26) >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: >> > @@ -38,6 +39,10 @@ struct intel_context { >> > struct i915_gem_context *gem_context; >> > struct intel_engine_cs *engine; >> > struct intel_engine_cs *inflight; >> > +#define intel_context_inflight(ce) ptr_mask_bits((ce)->inflight, 2) >> > +#define intel_context_inflight_count(ce) ptr_unmask_bits((ce)->inflight, 2) >> > +#define intel_context_inflight_inc(ce) ptr_count_inc(&(ce)->inflight) >> > +#define intel_context_inflight_dec(ce) ptr_count_dec(&(ce)->inflight) >> >> Just curious here that what you consider the advantages of carrying >> this info with the pointer? > > Packing. I just need a bit to track status, and one for overflow. > >> > +static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine) >> > +{ >> > + return (i915_ggtt_offset(engine->status_page.vma) + >> > + I915_GEM_HWS_PREEMPT_ADDR); >> > +} >> > + >> > +#define ring_pause(E) ((E)->status_page.addr[I915_GEM_HWS_PREEMPT]) >> >> Scary. Please lets make a function of ring_pause and use >> intel_write_status_page in it. > > I'd rather not unless you do __intel_write_state_page. > >> So I guess you have and you want squeeze the latency fruit. >> >> When we have everything in place, CI is green and >> everyone is happy, then we tear it down? > > Been there, done that. My fears come from csb. Granted, it is a diffent thing with a different direction of writes. > >> > @@ -442,13 +443,11 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) >> > struct intel_engine_cs *owner; >> > >> > if (i915_request_completed(rq)) >> > - break; >> > + continue; /* XXX */ >> >> Yeah, but what is the plan with the XXX. > > Mulling over tracking context not requests. We still end up with having > to scan history within a context, so not yet seeing anything to > encourage me to make the change. I worry about long request queues > causing preemption latency, as this list is currently only trimmed in > retirement. > > One idea in the background is for a scheduler (contemplating something > like the isosynchronous MuQSS) and that might call for a change to > using contexts as the primary, with requests within the contexts. > >> > @@ -1223,68 +1217,37 @@ static void process_csb(struct intel_engine_cs *engine) >> > * status notifier. >> > */ >> > >> > - GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x, active=0x%x\n", >> > + GEM_TRACE("%s csb[%d]: status=0x%08x:0x%08x\n", >> > engine->name, head, >> > - buf[2 * head + 0], buf[2 * head + 1], >> > - execlists->active); >> > + buf[2 * head + 0], buf[2 * head + 1]); >> > >> > status = buf[2 * head]; >> > - if (status & (GEN8_CTX_STATUS_IDLE_ACTIVE | >> > - GEN8_CTX_STATUS_PREEMPTED)) >> > - execlists_set_active(execlists, >> > - EXECLISTS_ACTIVE_HWACK); >> > - if (status & GEN8_CTX_STATUS_ACTIVE_IDLE) >> > - execlists_clear_active(execlists, >> > - EXECLISTS_ACTIVE_HWACK); >> > - >> > - if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK)) >> > - continue; >> > + if (status & GEN8_CTX_STATUS_IDLE_ACTIVE) { >> > +promote: >> > + GEM_BUG_ON(!assert_pending_valid(execlists, "promote")); >> > + execlists->active = >> > + memcpy(execlists->inflight, >> > + execlists->pending, >> > + execlists_num_ports(execlists) * >> > + sizeof(*execlists->pending)); >> > + execlists->pending[0] = NULL; >> >> I can't decide if comment or a helper inline function would >> serve better as documentation of between inflight and pending >> movement. > > The magic is just this function, I think process_csb() reads quite > nicely with the 3 branches and switching between different states. It's > about 8 lines without the comments and asserts. > Agreed. It is more compact and more readable with this patch. >> I guess it is better to be left as a future work after >> the dust settles. >> >> Just general yearning for a similar kind of level of documentation >> steps as in dequeue. >> >> > >> > - /* We should never get a COMPLETED | IDLE_ACTIVE! */ >> > - GEM_BUG_ON(status & GEN8_CTX_STATUS_IDLE_ACTIVE); >> >> Is our assert coverage going to suffer? > > You've looked at the added asserts and tracing; I claim we get stronger. > >> > @@ -2514,15 +2452,29 @@ static u32 *gen8_emit_wa_tail(struct i915_request *request, u32 *cs) >> > return cs; >> > } >> > >> > +static u32 *emit_preempt_busywait(struct i915_request *request, u32 *cs) >> > +{ >> > + *cs++ = MI_SEMAPHORE_WAIT | >> > + MI_SEMAPHORE_GLOBAL_GTT | >> > + MI_SEMAPHORE_POLL | >> > + MI_SEMAPHORE_SAD_EQ_SDD; >> > + *cs++ = 0; >> > + *cs++ = intel_hws_preempt_address(request->engine); >> > + *cs++ = 0; >> > + >> > + return cs; >> > +} >> > + >> > static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs) >> > { >> > cs = gen8_emit_ggtt_write(cs, >> > request->fence.seqno, >> > request->timeline->hwsp_offset, >> > 0); >> > - >> > *cs++ = MI_USER_INTERRUPT; >> > + >> > *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE; >> >> This was discussed in irc, could warrant a comment here of >> why this is needed. Precious info. > > Why the ARB, for reasons of yore. The comment for why we need it is > actually in bb_start. > > commit 279f5a00c9a9b39f4f6e9813e6d4da8c181d34c8 > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Date: Thu Oct 5 20:10:05 2017 +0100 > > drm/i915/execlists: Add a comment for the extra MI_ARB_ENABLE Ok, looks like it. I do like the new way of handling ports. Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > > >> > + cs = emit_preempt_busywait(request, cs); > > Why we use the semaphore? That should be explained in dequeue upon > setting up the preemption. > -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx