Re: [PATCH v2] drm/i915/gem: Don't leak non-persistent requests on changing engines

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Tvrtko Ursulin (2020-02-05 16:22:34)
> 
> 
> On 05/02/2020 12:13, Chris Wilson wrote:
> > If we have a set of active engines marked as being non-persistent, we
> > lose track of those if the user replaces those engines with
> > I915_CONTEXT_PARAM_ENGINES. As part of our uABI contract is that
> > non-persistent requests are terminated if they are no longer being
> > tracked by the user's context (in order to prevent a lost request
> > causing an untracked and so unstoppable GPU hang), we need to apply the
> > same context cancellation upon changing engines.
> > 
> > Fixes: a0e047156cde ("drm/i915/gem: Make context persistence optional")
> > Testcase: XXX
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 52a749691a8d..20f1d3e0221f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -1624,11 +1624,18 @@ set_engines(struct i915_gem_context *ctx,
> >   
> >   replace:
> >       mutex_lock(&ctx->engines_mutex);
> > +
> > +     /* Flush stale requests off the old engines if required */
> > +     if (!i915_gem_context_is_persistent(ctx) ||
> > +         !i915_modparams.enable_hangcheck)
> > +             kill_context(ctx);
> 
> Is the negative effect of this is legit contexts can't keep submitting 
> and changing the map? Only if PREEMPT_TIMEOUT is disabled I think but 
> still. Might break legitimate userspace. Not that I offer solutions.. :( 
> Banning changing engines once context went non-persistent? That too can 
> break someone.

It closes the hole we have. To do otherwise, we need to keep track of
the old engines. Not an impossible task, certainly inconvenient.

struct old_engines {
	struct i915_active active;
	struct list_head link;
	struct i915_gem_context *ctx;
	void *engines;
	int num_engines;
};

With a list+spinlock in the ctx that we can work in kill_context.

The biggest catch there is actually worrying about attaching the active
to already executing request, and making sure the coupling doesn't bug
on a concurrent completion. Hmm, it's just a completion callback, but
more convenient to use a ready made one.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux