Quoting Tvrtko Ursulin (2020-02-05 16:22:34) > > > On 05/02/2020 12:13, Chris Wilson wrote: > > If we have a set of active engines marked as being non-persistent, we > > lose track of those if the user replaces those engines with > > I915_CONTEXT_PARAM_ENGINES. As part of our uABI contract is that > > non-persistent requests are terminated if they are no longer being > > tracked by the user's context (in order to prevent a lost request > > causing an untracked and so unstoppable GPU hang), we need to apply the > > same context cancellation upon changing engines. > > > > Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional") > > Testcase: XXX > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > --- > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > index 52a749691a8d..20f1d3e0221f 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > @@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx, > > > > replace: > > mutex_lock(&ctx->engines_mutex); > > + > > + /* Flush stale requests off the old engines if required */ > > + if (!i915_gem_context_is_persistent(ctx) || > > + !i915_modparams.enable_hangcheck) > > + kill_context(ctx); > > Is the negative effect of this is legit contexts can't keep submitting > and changing the map? Only if PREEMPT_TIMEOUT is disabled I think but > still. Might break legitimate userspace. Not that I offer solutions.. :( > Banning changing engines once context went non-persistent? That too can > break someone. It closes the hole we have. To do otherwise, we need to keep track of the old engines. Not an impossible task, certainly inconvenient. struct old_engines { struct i915_active active; struct list_head link; struct i915_gem_context *ctx; void *engines; int num_engines; }; With a list+spinlock in the ctx that we can work in kill_context. The biggest catch there is actually worrying about attaching the active to already executing request, and making sure the coupling doesn't bug on a concurrent completion. Hmm, it's just a completion callback, but more convenient to use a ready made one. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx