Quoting Tvrtko Ursulin (2018-10-01 11:38:37)
>
> On 28/09/2018 14:58, Chris Wilson wrote:
> > When submitting chains to each engine, we can do so (mostly) in
> > parallel, so delegate submission to threads on a per-engine basis.
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/selftests/intel_lrc.c | 73 ++++++++++++++++++----
> >  1 file changed, 61 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> > index 3a474bb64c05..d68a924c530e 100644
> > --- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
> > @@ -587,8 +587,10 @@ static int random_priority(struct rnd_state *rnd)
> >  struct preempt_smoke {
> >  	struct drm_i915_private *i915;
> >  	struct i915_gem_context **contexts;
> > +	struct intel_engine_cs *engine;
> >  	unsigned int ncontext;
> >  	struct rnd_state prng;
> > +	unsigned long count;
> >  };
> >
> >  static struct i915_gem_context *smoke_context(struct preempt_smoke *smoke)
> > @@ -597,31 +599,78 @@ static struct i915_gem_context *smoke_context(struct preempt_smoke *smoke)
> >  							  &smoke->prng)];
> >  }
> >
> > +static int smoke_crescendo_thread(void *arg)
> > +{
> > +	struct preempt_smoke *smoke = arg;
> > +	IGT_TIMEOUT(end_time);
> > +	unsigned long count;
> > +
> > +	count = 0;
> > +	do {
> > +		struct i915_gem_context *ctx = smoke_context(smoke);
> > +		struct i915_request *rq;
> > +
> > +		mutex_lock(&smoke->i915->drm.struct_mutex);
> > +
> > +		ctx->sched.priority = count % I915_PRIORITY_MAX;
> > +
> > +		rq = i915_request_alloc(smoke->engine, ctx);
> > +		if (IS_ERR(rq)) {
> > +			mutex_unlock(&smoke->i915->drm.struct_mutex);
> > +			return PTR_ERR(rq);
> > +		}
> > +
> > +		i915_request_add(rq);
> > +
> > +		mutex_unlock(&smoke->i915->drm.struct_mutex);
> > +
> > +		count++;
>
> Very little outside the mutex so I am not sure if parallelization will
> work that well. Every thread could probably fill the ring in its
> timeslice?

Very unlikely due to the randomised ring. And we are working on that,
right? :)

> And then it blocks the others until there is space. It will
> heavily rely on scheduler behaviour and mutex fairness I think.

But it does bring the overall subtest time from num_engines to 1s, for
the same pattern.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
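For readers following the contention argument above, here is a minimal userspace sketch of the pattern under discussion: one thread per "engine", with nearly all of each iteration spent under a single shared lock. It is only an analogue using POSIX threads, not the i915 selftest code. Every name in it (smoke_worker, worker_fn, run_workers, big_lock, ring_total, NUM_WORKERS, SUBMITS_PER_WORKER) is hypothetical, and it runs a fixed number of iterations where the patch uses a timeout.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4            /* hypothetical stand-in for the number of engines */
#define SUBMITS_PER_WORKER 1000  /* fixed iteration count; the patch uses a timeout */

/* Plays the role of the one big lock that every submission must take. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long ring_total; /* shared state, only touched under big_lock */

struct smoke_worker {
	unsigned long count;     /* per-thread tally, like the count in the patch */
};

static void *worker_fn(void *arg)
{
	struct smoke_worker *w = arg;
	unsigned long i;

	for (i = 0; i < SUBMITS_PER_WORKER; i++) {
		/* Almost the whole iteration is serialised by the lock... */
		pthread_mutex_lock(&big_lock);
		ring_total++;    /* stand-in for allocating and adding a request */
		pthread_mutex_unlock(&big_lock);

		/* ...and only this bookkeeping runs outside it. */
		w->count++;
	}
	return NULL;
}

/* Start up to NUM_WORKERS threads, wait for them, return the summed counts. */
static unsigned long run_workers(struct smoke_worker *workers, size_t n)
{
	pthread_t threads[NUM_WORKERS];
	unsigned long sum = 0;
	size_t i, started = 0;

	if (n > NUM_WORKERS)
		n = NUM_WORKERS;

	ring_total = 0;
	for (i = 0; i < n; i++) {
		workers[i].count = 0;
		if (pthread_create(&threads[i], NULL, worker_fn, &workers[i]))
			break;   /* stop launching on the first failure */
		started++;
	}

	for (i = 0; i < started; i++) {
		pthread_join(threads[i], NULL);
		sum += workers[i].count;
	}
	return sum;
}
```

Calling run_workers() with an array of NUM_WORKERS entries gives a summed count that matches ring_total, since every increment of the shared counter is paired with one per-thread increment. How the threads interleave is left to the scheduler and to the mutex implementation, which is the dependency raised in the review comment.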