Re: [PATCH 3/5] drm/i915/execlists: Direct submit onto idle engines

Quoting Tvrtko Ursulin (2018-05-10 17:09:14)
> 
> On 09/05/2018 15:27, Chris Wilson wrote:
> > Bypass using the tasklet to submit the first request to HW, as the
> > tasklet may be deferred unto ksoftirqd and at a minimum will add in
> > excess of 10us (and maybe tens of milliseconds) to our execution
> > latency. This latency reduction is most notable when execution flows
> > between engines.
> > 
> > v2: Beware handling preemption completion from the direct submit path as
> > well.
> > v3: Make the abuse clear and track our extra state inside i915_tasklet.
> > 
> > Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > ---
> >   drivers/gpu/drm/i915/i915_tasklet.h         | 24 +++++++
> >   drivers/gpu/drm/i915/intel_guc_submission.c | 10 ++-
> >   drivers/gpu/drm/i915/intel_lrc.c            | 71 +++++++++++++++++----
> >   3 files changed, 89 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_tasklet.h b/drivers/gpu/drm/i915/i915_tasklet.h
> > index 42b002b88edb..99e2fa2241ba 100644
> > --- a/drivers/gpu/drm/i915/i915_tasklet.h
> > +++ b/drivers/gpu/drm/i915/i915_tasklet.h
> > @@ -8,8 +8,11 @@
> >   #define _I915_TASKLET_H_
> >   
> >   #include <linux/atomic.h>
> > +#include <linux/bitops.h>
> >   #include <linux/interrupt.h>
> >   
> > +#include "i915_gem.h"
> > +
> >   /**
> >    * struct i915_tasklet - wrapper around tasklet_struct
> >    *
> > @@ -19,6 +22,8 @@
> >    */
> >   struct i915_tasklet {
> >       struct tasklet_struct base;
> > +     unsigned long flags;
> > +#define I915_TASKLET_DIRECT_SUBMIT BIT(0)
> 
> I would suggest a more generic name for the bit since i915_tasklet is 
> generic-ish. For instance simply I915_TASKLET_DIRECT would signify the 
> callback has been invoked directly and not (necessarily) from softirq 
> context. Then it is for each user to know what that means for them 
> specifically.

The problem is that we have two direct invocations, and only one of them
is special. The flag really wants to be something like
I915_TASKLET_ENGINE_IS_LOCKED; you can see why I didn't propose that.

> > -static void __submit_queue(struct intel_engine_cs *engine, int prio)
> > +static void __wakeup_queue(struct intel_engine_cs *engine, int prio)
> >   {
> >       engine->execlists.queue_priority = prio;
> > +}
> 
> Why is this called wakeup? Plans to add something in it later?

Yes. It's called wakeup because it sets the priority at which the dequeue
wakes up. The first name was kick_queue, but it doesn't kick either.

The side effect added later in the series involves controlling timers.

__restart_queue()?

> > +static void __schedule_queue(struct intel_engine_cs *engine)
> > +{
> >       i915_tasklet_schedule(&engine->execlists.tasklet);
> >   }
> >   
> > +static bool __direct_submit(struct intel_engine_execlists *const execlists)
> > +{
> > +     struct i915_tasklet * const t = &execlists->tasklet;
> > +
> > +     if (!tasklet_trylock(&t->base))
> > +             return false;
> > +
> > +     t->flags |= I915_TASKLET_DIRECT_SUBMIT;
> > +     i915_tasklet_run(t);
> > +     t->flags &= ~I915_TASKLET_DIRECT_SUBMIT;
> > +
> > +     tasklet_unlock(&t->base);
> 
> Feels like this whole sequence belongs to i915_tasklet since it touches 
> the internals. Maybe i915_tasklet_try_run, or i915_tasklet_run_or_schedule?

Keep reading the series and you'll see just why this is so special and
confined to execlists.

> > +     return true;
> > +}
> > +
> > +static void __submit_queue(struct intel_engine_cs *engine)
> > +{
> > +     struct intel_engine_execlists * const execlists = &engine->execlists;
> > +
> > +     GEM_BUG_ON(!engine->i915->gt.awake);
> > +
> > +     /* If inside GPU reset, the tasklet will be queued later. */
> > +     if (!i915_tasklet_is_enabled(&execlists->tasklet))
> > +             return;
> > +
> > +     /* Directly submit the first request to reduce the initial latency */
> > +     if (port_isset(execlists->port) || !__direct_submit(execlists))
> > +             __schedule_queue(engine);
> 
> Hmm a bit evil to maybe invoke in the condition. Would it be acceptable to:
> 
> if (!port_isset(...))
>         i915_tasklet_run_or_schedule(...);
> else
>         i915_tasklet_schedule(...);
> 
> It's not ideal but maybe a bit better.

Beauty is in the eye of the beholder, and that ain't beautiful :)
-Chris
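
For readers following the thread, here is a minimal user-space sketch of
the control flow under discussion. It is not the driver code: every
model_* name and MODEL_DIRECT_SUBMIT are hypothetical stand-ins, a C11
atomic_flag stands in for the tasklet run lock, and "scheduling" merely
bumps a counter. It shows the trylock-then-run path from the patch next to
the run-or-schedule shape suggested in review, so the two can be compared.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_DIRECT_SUBMIT (1UL << 0)

/* Hypothetical user-space stand-in for struct i915_tasklet. */
struct model_tasklet {
	atomic_flag running;	/* models the tasklet run lock */
	unsigned long flags;	/* models i915_tasklet.flags */
	void (*func)(struct model_tasklet *t);
	int direct_runs;	/* callback ran on the caller's stack */
	int deferred;		/* fell back to deferred scheduling */
};

/* Models tasklet_trylock(): true if we took the run lock. */
static bool model_trylock(struct model_tasklet *t)
{
	return !atomic_flag_test_and_set_explicit(&t->running,
						  memory_order_acquire);
}

/* Models tasklet_unlock(). */
static void model_unlock(struct model_tasklet *t)
{
	atomic_flag_clear_explicit(&t->running, memory_order_release);
}

/* Models i915_tasklet_schedule(): only records the deferral. */
static void model_schedule(struct model_tasklet *t)
{
	t->deferred++;
}

/* Same shape as __direct_submit() in the quoted patch. */
static bool model_direct_submit(struct model_tasklet *t)
{
	if (!model_trylock(t))
		return false;	/* already running elsewhere */

	t->flags |= MODEL_DIRECT_SUBMIT;
	t->func(t);
	t->flags &= ~MODEL_DIRECT_SUBMIT;

	model_unlock(t);
	return true;
}

/* Shape used in the patch: the call sits inside the condition. */
static void model_submit_queue(struct model_tasklet *t, bool port_busy)
{
	if (port_busy || !model_direct_submit(t))
		model_schedule(t);
}

/* Shape suggested in review, with a run-or-schedule helper. */
static void model_run_or_schedule(struct model_tasklet *t)
{
	if (!model_direct_submit(t))
		model_schedule(t);
}

static void model_submit_queue_alt(struct model_tasklet *t, bool port_busy)
{
	if (!port_busy)
		model_run_or_schedule(t);
	else
		model_schedule(t);
}

/* In this model the callback is only ever invoked via the direct path. */
static void model_callback(struct model_tasklet *t)
{
	assert(t->flags & MODEL_DIRECT_SUBMIT);
	t->direct_runs++;
}
```

In this model both shapes make the same decision for every combination of
busy port and held lock; the difference is only where the fallback to
scheduling is spelled out.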



