Quoting Tvrtko Ursulin (2018-05-08 11:23:09) > > On 07/05/2018 14:57, Chris Wilson wrote: > > Bypass using the tasklet to submit the first request to HW, as the > > tasklet may be deferred unto ksoftirqd and at a minimum will add in > > excess of 10us (and maybe tens of milliseconds) to our execution > > latency. This latency reduction is most notable when execution flows > > between engines. > > > > v2: Beware handling preemption completion from the direct submit path as > > well. > > > > Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/intel_guc_submission.c | 12 +++- > > drivers/gpu/drm/i915/intel_lrc.c | 66 +++++++++++++++++---- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 7 +++ > > 3 files changed, 69 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c > > index 2feb65096966..6bfe30af7826 100644 > > --- a/drivers/gpu/drm/i915/intel_guc_submission.c > > +++ b/drivers/gpu/drm/i915/intel_guc_submission.c > > @@ -754,14 +754,20 @@ static bool __guc_dequeue(struct intel_engine_cs *engine) > > > > static void guc_dequeue(struct intel_engine_cs *engine) > > { > > - unsigned long flags; > > + unsigned long uninitialized_var(flags); > > bool submit; > > > > local_irq_save(flags); > > > > - spin_lock(&engine->timeline.lock); > > + GEM_BUG_ON(!test_bit(TASKLET_STATE_RUN, > > + &engine->execlists.tasklet.state)); > > Soon it will be time for i915_tasklet. :) > > > + if (!intel_engine_direct_submit(engine)) > > + spin_lock(&engine->timeline.lock); > > A bit ugly both on the conditional locking and using engine->flags for > transient purposes. > > Since you are locking the tasklet and own it (and open coding the call) > completely when calling directly, you could just the same cheat and call > a different function? My first attempt was to call __execlists_dequeue() directly and not tasklet->func(). But that then has this nasty if (tasklet->func == execlists_submission_tasklet) or some such in the middle of otherwise generic code. https://patchwork.freedesktop.org/patch/221105/ I was less happy about that. At least this does have the making of something more generic like i915_tasklet ;) > > submit = __guc_dequeue(engine); > > - spin_unlock(&engine->timeline.lock); > > + > > + if (!intel_engine_direct_submit(engine)) > > + spin_unlock(&engine->timeline.lock); > > > > if (submit) > > guc_submit(engine); > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > > index 15c373ea5b7e..ac7c5edee4ee 100644 > > --- a/drivers/gpu/drm/i915/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/intel_lrc.c > > @@ -357,13 +357,16 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists) > > { > > struct intel_engine_cs *engine = > > container_of(execlists, typeof(*engine), execlists); > > - unsigned long flags; > > + unsigned long uninitialized_var(flags); > > > > - spin_lock_irqsave(&engine->timeline.lock, flags); > > + GEM_BUG_ON(!test_bit(TASKLET_STATE_RUN, &execlists->tasklet.state)); > > + if (!intel_engine_direct_submit(engine)) > > + spin_lock_irqsave(&engine->timeline.lock, flags); > > > > __unwind_incomplete_requests(engine); > > > > - spin_unlock_irqrestore(&engine->timeline.lock, flags); > > + if (!intel_engine_direct_submit(engine)) > > + spin_unlock_irqrestore(&engine->timeline.lock, flags); > > Hm ok yes, this one would be a problem.. > > Maybe at least use some bit under execlists state instead of engine flags? But I have engine->flags :-p Could I steal a bit from tasklet.state? I tend to get funny looks everytime I ask for TASKLET_STATE_USER ;) -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx