Quoting Tvrtko Ursulin (2018-05-10 18:26:31) > > On 10/05/2018 17:25, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2018-05-10 17:09:14) > >> > >> On 09/05/2018 15:27, Chris Wilson wrote: > >>> Bypass using the tasklet to submit the first request to HW, as the > >>> tasklet may be deferred unto ksoftirqd and at a minimum will add in > >>> excess of 10us (and maybe tens of milliseconds) to our execution > >>> latency. This latency reduction is most notable when execution flows > >>> between engines. > >>> > >>> v2: Beware handling preemption completion from the direct submit path as > >>> well. > >>> v3: Make the abuse clear and track our extra state inside i915_tasklet. > >>> > >>> Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >>> --- > >>> drivers/gpu/drm/i915/i915_tasklet.h | 24 +++++++ > >>> drivers/gpu/drm/i915/intel_guc_submission.c | 10 ++- > >>> drivers/gpu/drm/i915/intel_lrc.c | 71 +++++++++++++++++---- > >>> 3 files changed, 89 insertions(+), 16 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/i915_tasklet.h b/drivers/gpu/drm/i915/i915_tasklet.h > >>> index 42b002b88edb..99e2fa2241ba 100644 > >>> --- a/drivers/gpu/drm/i915/i915_tasklet.h > >>> +++ b/drivers/gpu/drm/i915/i915_tasklet.h > >>> @@ -8,8 +8,11 @@ > >>> #define _I915_TASKLET_H_ > >>> > >>> #include <linux/atomic.h> > >>> +#include <linux/bitops.h> > >>> #include <linux/interrupt.h> > >>> > >>> +#include "i915_gem.h" > >>> + > >>> /** > >>> * struct i915_tasklet - wrapper around tasklet_struct > >>> * > >>> @@ -19,6 +22,8 @@ > >>> */ > >>> struct i915_tasklet { > >>> struct tasklet_struct base; > >>> + unsigned long flags; > >>> +#define I915_TASKLET_DIRECT_SUBMIT BIT(0) > >> > >> I would suggest a more generic name for the bit since i915_tasklet is > >> generic-ish. For instance simply I915_TASKLET_DIRECT would signify the > >> callback has been invoked directly and not (necessarily) from softirq > >> context. Then it is for each user to know what that means for them > >> specifically. > > > > Problem is we have two direct invocations, only one is special. It > > really wants to be something like I915_TASKLET_ENGINE_IS_LOCKED - you can > > see why I didn't propose that. > > TBC... > > >>> -static void __submit_queue(struct intel_engine_cs *engine, int prio) > >>> +static void __wakeup_queue(struct intel_engine_cs *engine, int prio) > >>> { > >>> engine->execlists.queue_priority = prio; > >>> +} > >> > >> Why is this called wakeup? Plans to add something in it later? > > > > Yes. It's called wakeup because it's setting the value that the dequeue > > wakes up at. First name was kick_queue, but it doesn't kick either. > > > > The later side-effect involves controlling timers. > > > > __restart_queue()? > > __update_queue_priority? :) It doesn't just update the priority... Now a choice between restart_queue and update_queue. > >>> +static void __schedule_queue(struct intel_engine_cs *engine) > >>> +{ > >>> i915_tasklet_schedule(&engine->execlists.tasklet); > >>> } > >>> > >>> +static bool __direct_submit(struct intel_engine_execlists *const execlists) > >>> +{ > >>> + struct i915_tasklet * const t = &execlists->tasklet; > >>> + > >>> + if (!tasklet_trylock(&t->base)) > >>> + return false; > >>> + > >>> + t->flags |= I915_TASKLET_DIRECT_SUBMIT; > >>> + i915_tasklet_run(t); > >>> + t->flags &= ~I915_TASKLET_DIRECT_SUBMIT; > >>> + > >>> + tasklet_unlock(&t->base); > >> > >> Feels like this whole sequence belongs to i915_tasklet since it touches > >> the internals. Maybe i915_tasklet_try_run, or i915_tasklet_run_or_schedule? > > > > Keep reading the series and you'll see just why this is so special and > > confined to execlists. > > ... TBC here. > > Having peeked ahead, it feels a bit not generic enough as it is, a bit > too hacky. > > Would it work to pass context together with the invocation. Like: > > i915_tasklet_try(..., I915_TASKLET_SUBMIT_IDLE); > i915_tasklet_try(..., I915_TASKLET_SUBMIT_IRQ); > > i915_tasklet.flags field namespace would then be owned by the caller > completely. And the tasklet func itself would have more context on what > to do. That doesn't apply very well to the use case either. It's not the tasklet being called from irq/process that's significant but whether we are calling it with the engine/data locked. I keep wanting to use LOCKED, but that has no meaning to the tasklet, and tasklet_trylock means something entirely different. > Following form that, i915_tasklet_run_or_schedule(.., flags). > > bool i915_taskle_try(tasklet, flags) > { > if (!trylock) > return false; > > t->flags |= flags; > i915_tasklet_run(...); > t->flags &= ~flags; > > tasklet_unlock(...); > > return true; > } > > > void i915_tasklet_run_or_schedule(..., flags) > { > if (!i915_tasklet_try(..., flags)) > i915_tasklet_schedule(...); > } > > ? > > Leaves a question of a tasklet_is_enabled check in your tasklet_try, > which I don't quite get since that check wasn't there before. So why it > is needed? Concurrent reset can happen at irq time, but we know cannot happen from the submit path. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx