Quoting Tvrtko Ursulin (2018-10-01 11:51:23)
> 
> On 19/09/2018 20:55, Chris Wilson wrote:
> > Inside the execlists submission tasklet, we often make the mistake of
> > assuming that everything beneath the request is available for use.
> > However, the submission and the request live on two separate timelines,
> > and the request contents may be freed from an early retirement before we
> > have had a chance to run the submission tasklet (think ksoftirqd). To
> > safeguard ourselves against any mistakes, flush the tasklet before we
> > unpin the context if execlists still has a reference to this context.
> > 
> > References: 60367132a214 ("drm/i915: Avoid use-after-free of ctx in request tracepoints")
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_context.h |  1 +
> >  drivers/gpu/drm/i915/intel_lrc.c        | 32 ++++++++++++++++++++++++-
> >  2 files changed, 32 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 9f89119a6566..1fd71dfdfa62 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -170,6 +170,7 @@ struct i915_gem_context {
> >          /** engine: per-engine logical HW state */
> >          struct intel_context {
> >                  struct i915_gem_context *gem_context;
> > +                struct intel_engine_cs *active;
> >                  struct i915_vma *state;
> >                  struct intel_ring *ring;
> >                  u32 *lrc_reg_state;
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 48a2bca7fec3..be7dbdd7fc2c 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -282,6 +282,8 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >                  __i915_request_unsubmit(rq);
> >                  unwind_wa_tail(rq);
> > 
> > +                GEM_BUG_ON(rq->hw_context->active);
> > +
> >                  GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
> >                  if (rq_prio(rq) != prio) {
> >                          prio = rq_prio(rq);
> > @@ -427,8 +429,11 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> >                  rq = port_unpack(&port[n], &count);
> >                  if (rq) {
> >                          GEM_BUG_ON(count > !n);
> > -                        if (!count++)
> > +                        if (!count++) {
> > +                                GEM_BUG_ON(rq->hw_context->active);
> >                                  execlists_context_schedule_in(rq);
> > +                                rq->hw_context->active = engine;
> 
> Put it in execlists_context_schedule_in/out?
> 
> Why does it have to be the engine pointer and not just a boolean?
> Because we don't have an engine backpointer in hw_context? Should we add
> it? I think I have occasionally wished we had it... maybe too much work to
> evaluate what function prototypes we could clean up with it and whether
> it would be an overall gain.

It's a backpointer in hw_context for the purposes of being a backpointer
in hw_context. Its purpose is for:

        active = READ_ONCE(ve->context.active);
        if (active && active != engine) {
                rb = rb_next(rb);
                continue;
        }
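
Folding the tracking into the helpers, as suggested, might look
something like the sketch below (written against the existing
execlists_context_schedule_in/out helpers in intel_lrc.c; an
illustration of the suggestion, not the posted patch):

        static inline void
        execlists_context_schedule_in(struct i915_request *rq)
        {
                /* sketch: the context must not be active on any engine yet */
                GEM_BUG_ON(rq->hw_context->active);
                execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
                intel_engine_context_in(rq->engine);
                rq->hw_context->active = rq->engine;
        }

        static inline void
        execlists_context_schedule_out(struct i915_request *rq,
                                       unsigned long status)
        {
                rq->hw_context->active = NULL;
                intel_engine_context_out(rq->engine);
                execlists_context_status_change(rq, status);
        }

That would keep the mark-active/mark-idle transitions next to the
existing schedule-in/out accounting and drop the open-coded
assignments from the submit, cancel and process_csb paths.
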
> > +                        }
> >                          port_set(&port[n], port_pack(rq, count));
> >                          desc = execlists_update_context(rq);
> >                          GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
> > @@ -734,6 +739,8 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
> >                            intel_engine_get_seqno(rq->engine));
> > 
> >                  GEM_BUG_ON(!execlists->active);
> > +
> > +                rq->hw_context->active = NULL;
> >                  execlists_context_schedule_out(rq,
> >                                                 i915_request_completed(rq) ?
> >                                                 INTEL_CONTEXT_SCHEDULE_OUT :
> > @@ -971,6 +978,7 @@ static void process_csb(struct intel_engine_cs *engine)
> >                           */
> >                          GEM_BUG_ON(!i915_request_completed(rq));
> > 
> > +                        rq->hw_context->active = NULL;
> >                          execlists_context_schedule_out(rq,
> >                                                         INTEL_CONTEXT_SCHEDULE_OUT);
> >                          i915_request_put(rq);
> > @@ -1080,6 +1088,28 @@ static void execlists_context_destroy(struct intel_context *ce)
> > 
> >  static void execlists_context_unpin(struct intel_context *ce)
> >  {
> > +        struct intel_engine_cs *engine;
> > +
> > +        /*
> > +         * The tasklet may still be using a pointer to our state, via an
> > +         * old request. However, since we know we only unpin the context
> > +         * on retirement of the following request, we know that the last
> > +         * request referencing us will have had a completion CS interrupt.
> 
> Hm hm hm... my initial thought was that interrupts could be more delayed
> than breadcrumb writes (more than one context ahead), in which case the
> process_csb below could be premature and the assert would trigger. But I
> must be forgetting something since that would also mean we would
> prematurely unpin the context. So I must be forgetting something..

There will have been at least one CS event written because we have
switched contexts due to the unpin being one seqno behind. I have not
(yet) observed CS events being out of order with breadcrumb writes (and
we have very strict checking), so I am confident that a single
process_csb() is required rather than a loop.
-Chris
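
For completeness, the flush-before-unpin scheme under discussion could
take roughly the following shape (the remainder of the
execlists_context_unpin() hunk is elided in the quote above, so the
locking details here are assumptions, not the posted patch):

        static void execlists_context_unpin(struct intel_context *ce)
        {
                struct intel_engine_cs *engine;

                /*
                 * Sketch: if the tasklet has not yet processed the final
                 * CS event for this context, run process_csb() ourselves
                 * so that no stale pointer outlives the unpin.
                 */
                engine = READ_ONCE(ce->active);
                if (engine) {
                        unsigned long flags;

                        spin_lock_irqsave(&engine->timeline.lock, flags);
                        process_csb(engine);
                        spin_unlock_irqrestore(&engine->timeline.lock, flags);

                        /* a single pass suffices, per the reasoning above */
                        GEM_BUG_ON(READ_ONCE(ce->active));
                }

                /* ... existing unpin of the ring, state and gem_context ... */
        }

The single process_csb() pass relies on the argument above: by the time
we unpin, the final CS event for this context must already have been
written, so one flush is enough to drop the tasklet's reference.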