Re: [PATCH 4/5] drm/i915: Fixup preempt-to-busy vs reset of a virtual request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Tvrtko Ursulin (2019-09-23 14:46:27)
> 
> On 21/09/2019 10:55, Chris Wilson wrote:
> > Due to the nature of preempt-to-busy the execlists active tracking and
> > the schedule queue may become temporarily desync'ed (between resubmission
> > to HW and its ack from HW). This means that we may have unwound a
> > request and passed it back to the virtual engine, but it is still
> > inflight on the HW and may even result in a GPU hang. If we detect that
> > GPU hang and try to reset, the hanging request->engine will no longer
> > match the current engine, which means that the request is not on the
> > execlists active list and we should not try to find an older incomplete
> > request. Given that we have deduced this must be a request on a virtual
> > engine, it is the single active request in the context and so must be
> > guilty (as the context is still inflight, it is prevented from being
> > executed on another engine as we process the reset).
> > 
> > Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c   | 7 +++++--
> >   drivers/gpu/drm/i915/gt/intel_reset.c | 4 +---
> >   2 files changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 1b2bacc60300..3eadd294bcd7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -2326,11 +2326,14 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
> >   
> >   static struct i915_request *active_request(struct i915_request *rq)
> >   {
> > -     const struct list_head * const list =
> > -             &i915_request_active_timeline(rq)->requests;
> >       const struct intel_context * const ce = rq->hw_context;
> >       struct i915_request *active = NULL;
> > +     struct list_head *list;
> >   
> > +     if (!i915_request_is_active(rq)) /* unwound, but incomplete! */
> 
> Is it time to store the veng pointer separately in the request so we can 
> add the assert here? Like 
> GEM_BUG_ON(!...engine_is_virtual(rq->orig_engine)) ?

intel_engine_is_virtual(rq->hw_context->engine).

But this is currently agnostic, and here we are only asking question is
this request in the engine->active.list, which applies to both normal
and virtual setups. Hmm, this bug does apply to normal reset, I only
spotted it because we added the lockdep warnings that picked up that
rq->engine was different.

There maybe some use for that assertion, but I don't think this is one.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux