Quoting Tvrtko Ursulin (2018-05-08 11:02:38)
>
> On 07/05/2018 14:57, Chris Wilson wrote:
> > During request submission, we call the engine->schedule() function so
> > that we may reorder the active requests as required for inheriting the
> > new request's priority. This may schedule several tasklets to run on the
> > local CPU, but we will need to schedule the tasklets again for the new
> > request. Delay all the local tasklets until the end, so that we only
> > have to process the queue just once.
> >
> > v2: Beware PREEMPT_RCU, as then local_bh_disable() is not a
> > superset of rcu_read_lock().
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_request.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index e4cf76ec14a6..f336942229cf 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1110,12 +1110,11 @@ void __i915_request_add(struct i915_request *request, bool flush_caches)
> >  	 * decide whether to preempt the entire chain so that it is ready to
> >  	 * run at the earliest possible convenience.
> >  	 */
> > -	rcu_read_lock();
> > +	local_bh_disable();
> > +	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> >  	if (engine->schedule)
> >  		engine->schedule(request, &request->ctx->sched);
> >  	rcu_read_unlock();
> > -
> > -	local_bh_disable();
> >  	i915_sw_fence_commit(&request->submit);
> >  	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
>
> Before I wasn't sure about this one, since it lengthens the softirq-off
> section and still doesn't make the re-schedule atomic. But today I am
> thinking that in normal use dependency chains should not be deep enough
> for it to become an issue.
> In the light of that, go for it.

We have big problems when the dependency chain grows, simply because we
recursively build a linear list and then walk it again. It's not the
worst nightmare in the code, and I don't think anyone has complained
about it yet, but we do have a few pathological igt tests
(gem_exec_schedule deep/wide) to show that it can be the rate-limiting
step.

As for blocking tasklets across this: only the local tasklet is blocked,
and for good reason, as we reorder the queues. A tasklet on another CPU
is also (mostly) blocked by the spinlock (and local irq off). Overall I
don't think the problem is exacerbated at all by the local_bh_disable().
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx