Quoting Joonas Lahtinen (2017-09-28 15:10:03)
> On Wed, 2017-09-27 at 17:44 +0100, Chris Wilson wrote:
> > When we write to ELSP, it triggers a context preemption at the earliest
> > arbitration point (3DPRIMITIVE, some PIPECONTROLs, a few other
> > operations and the explicit MI_ARB_CHECK). If this is to the same
> > context, it triggers a LITE_RESTORE where the RING_TAIL is merely
> > updated (used currently to chain requests from the same context
> > together, avoiding bubbles). However, if it is to a different context, a
> > full context-switch is performed and it will start to execute the new
> > context saving the image of the old for later execution.
> >
> > Previously we avoided preemption by only submitting a new context when
> > the old was idle. But now we wish embrace it, and if the new request has
> > a higher priority than the currently executing request, we write to the
> > ELSP regardless, thus triggering preemption, but we tell the GPU to
> > switch to our special preemption context (not the target). In the
> > context-switch interrupt handler, we know that the previous contexts
> > have finished execution and so can unwind all the incomplete requests
> > and compute the new highest priority request to execute.
> >
> > It would be feasible to avoid the switch-to-idle intermediate by
> > programming the ELSP with the target context. The difficulty is in
> > tracking which request that should be whilst maintaining the dependency
> > change, the error comes in with coalesced requests. As we only track the
> > most recent request and its priority, we may run into the issue of being
> > tricked in preempting a high priority request that was followed by a
> > low priority request from the same context (e.g. for PI);
>
> "followed" is bit ambiguous here, depending on how you view the
> ordering, wall time or ports.

Not in this case. Same context == same timeline, i.e. fifo.
:)

> > worse still
> > that earlier request may be our own dependency and the order then broken
> > by preemption. By injecting the switch-to-idle and then recomputing the
> > priority queue, we avoid the issue with tracking in-flight coalesced
> > requests. Having tried the preempt-to-busy approach, and failed to find
> > a way around the coalesced priority issue, Michal's original proposal to
> > inject an idle context (based on handling GuC preemption) succeeds.
> >
> > The current heuristic for deciding when to preempt are only if the new
> > request is of higher priority, and has the privileged priority of
> > greater than 0. Note that the scheduler remains unfair!
> >
> > v2: Disable for gen8 (bdw/bsw) as we need additional w/a for GPGPU.
> > Since, the feature is now conditional and not always available when we
> > have a scheduler, make it known via the HAS_SCHEDULER GETPARAM (now a
> > capability mask).
> >
> > Suggested-by: Michal Winiarski <michal.winiarski@xxxxxxxxx>
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Michal Winiarski <michal.winiarski@xxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > Cc: Arkadiusz Hiler <arkadiusz.hiler@xxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
> > Cc: Ben Widawsky <benjamin.widawsky@xxxxxxxxx>
> > Cc: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx>
> > Cc: Zhi Wang <zhi.a.wang@xxxxxxxxx>
>
> <SNIP>
>
> > @@ -489,26 +489,44 @@ static void port_assign(struct execlist_port *port,
> >  	port_set(port, port_pack(i915_gem_request_get(rq), port_count(port)));
> >  }
> >
> > +static void inject_preempt_context(struct intel_engine_cs *engine)
> > +{
> > +	struct intel_context *ce =
> > +		&engine->i915->preempt_context->engine[engine->id];
> > +	u32 __iomem *elsp =
> > +		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
>
> engine_elsp() helper or so?

No. People doing this should suffer.
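
Going back to the heuristic in the commit message quoted above (preempt only if the new request is of higher priority than the one executing, and that priority is greater than 0), it boils down to a single comparison. A minimal sketch follows; the function name and the plain int priorities are made up for illustration and are not the driver's code:

```c
#include <stdbool.h>

/*
 * Hypothetical illustration only: need_preempt_sketch() and its plain
 * int arguments are assumptions, not the i915 implementation.
 *
 * Preempt only when the incoming request has a privileged priority
 * (greater than 0) and outranks the request currently executing.
 */
static bool need_preempt_sketch(int new_prio, int active_prio)
{
	return new_prio > 0 && new_prio > active_prio;
}
```

So a default-priority (0) request never preempts anything, even a negative-priority one, and equal priorities never preempt each other. Nothing here makes the scheduler fair.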
> > +	unsigned int n;
> > +
> > +	GEM_BUG_ON(engine->i915->preempt_context->hw_id != PREEMPT_ID);
>
> I think this could/should be done way earlier?

This is the earliest point in the sequence. We assert that the value
that we stuff into the upper_32_bits(desc) will match the value we
extract from the upper_32_bits(status).

> > +
> > +	memset(ce->ring->vaddr + ce->ring->tail, 0, 8);
> > +	ce->ring->tail += 8;
> > +	ce->ring->tail &= (ce->ring->size - 1);
> > +	ce->lrc_reg_state[CTX_RING_TAIL+1] = ce->ring->tail;
>
> An awful lot of pre-expectations here, would be shame if somebody
> documented them.

Like which? HW requirement for qword aligned tail updates, some HW
requirement to ensure HEAD != TAIL. Have you seen the extensive
commentary that preceded this function?

> > @@ -696,7 +746,7 @@ static void intel_lrc_irq_handler(unsigned long data)
> >  {
> >  	struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
> >  	struct intel_engine_execlists * const execlists = &engine->execlists;
> > -	struct execlist_port *port = execlists->port;
> > +	struct execlist_port * const port = execlists->port;
> >  	struct drm_i915_private *dev_priv = engine->i915;
> >
> >  	/* We can skip acquiring intel_runtime_pm_get() here as it was taken
> > @@ -781,6 +831,23 @@ static void intel_lrc_irq_handler(unsigned long data)
> >  			if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
> >  				continue;
> >
> > +			if (status & GEN8_CTX_STATUS_ACTIVE_IDLE &&
> > +			    buf[2*head + 1] == PREEMPT_ID) {
>
> (2 * head + 1), could be a helper again, potentially with a descriptive
> name so the below comment can be removed and doesn't need to be
> duplicated :)

Which comment? The comment for the motivation behind the BUG_ON?
-Chris
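
For reference, the tail update quoted above can be exercised on its own. The sketch below is a standalone approximation: struct ring_sketch and ring_advance_qword() are hypothetical stand-ins, not struct intel_ring or any driver function. The asserts spell out the expectations under discussion: a power-of-two ring size so the mask wraps correctly, and a qword-aligned tail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the ring; not the driver's struct intel_ring. */
struct ring_sketch {
	uint8_t *vaddr;	/* backing storage for the ring */
	uint32_t size;	/* size in bytes, assumed to be a power of two */
	uint32_t tail;	/* byte offset of the tail, assumed qword aligned */
};

/* Write one zeroed qword at the tail, then advance the tail with wrap. */
static uint32_t ring_advance_qword(struct ring_sketch *ring)
{
	/* The mask below only wraps correctly for power-of-two sizes. */
	assert(ring->size >= 16 && (ring->size & (ring->size - 1)) == 0);
	/* An aligned tail within the ring also keeps the write in bounds. */
	assert((ring->tail & 7) == 0 && ring->tail < ring->size);

	memset(ring->vaddr + ring->tail, 0, 8);
	ring->tail += 8;
	ring->tail &= ring->size - 1;

	return ring->tail;
}
```

Because the tail always moves by exactly one qword and the ring is larger than that, the new tail always differs from the old one, which is one way to read the HEAD != TAIL requirement.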