Quoting Chris Wilson (2019-10-24 14:40:18) > Our existing behaviour is to allow contexts and their GPU requests to > persist past the point of closure until the requests are complete. This > allows clients to operate in a 'fire-and-forget' manner where they can > setup a rendering pipeline and hand it over to the display server and > immediately exiting. As the rendering pipeline is kept alive until > completion, the display server (or other consumer) can use the results > in the future and present them to the user. > > However, not all clients want this persistent behaviour and would prefer > that the contexts are cleaned up immediately upon closure. This ensures > that when clients are run without hangchecking, any GPU hang is > terminated with the process and does not continue to hog resources. It's worth mentioning GPU compute workloads of indeterminate length as an example for increased clarity. > By defining a context property to allow clients to control persistence > explicitly, we can remove the blanket advice to disable hangchecking > that seems to be far too prevalent. I would drop this paragraph from this patch, as it doesn't (yet) make sense. > The default behaviour for new controls is the legacy persistence mode. > New clients will have to opt out for immediate cleanup on context > closure. "... have to opt in to immediate ..." reads more clear. > If the hangchecking modparam is disabled, so is persistent > context support -- all contexts will be terminated on closure. Let's add here that it has actually been a source of bug reports in the past that we don't terminate the workoads. So we expect this behaviour change to be welcomed by the compute users who have been instructed to disable the hanghceck in the past. A couple of comments below. But anyway, with the uAPI comment and commit message clarified this is: Reviewed-by: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> Still needs the Link:, though. > Testcase: igt/gem_ctx_persistence > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx> > Cc: Jon Bloomfield <jon.bloomfield@xxxxxxxxx> > Reviewed-by: Jon Bloomfield <jon.bloomfield@xxxxxxxxx> > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> <SNIP> > +static int __context_set_persistence(struct i915_gem_context *ctx, bool state) > +{ > + if (i915_gem_context_is_persistent(ctx) == state) > + return 0; > + > + if (state) { > + /* > + * Only contexts that are short-lived [that will expire or be > + * reset] are allowed to survive past termination. We require > + * hangcheck to ensure that the persistent requests are healthy. > + */ > + if (!i915_modparams.enable_hangcheck) > + return -EINVAL; This is slightly confusing as the default is to enable persistence. Disabling and re-enabling would result -EINVAL. But I guess it's no problem to lift such restriction later. > +++ b/include/uapi/drm/i915_drm.h > @@ -1572,6 +1572,21 @@ struct drm_i915_gem_context_param { > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > */ > #define I915_CONTEXT_PARAM_ENGINES 0xa > + > +/* > + * I915_CONTEXT_PARAM_PERSISTENCE: > + * > + * Allow the context and active rendering to survive the process until > + * completion. Persistence allows fire-and-forget clients to queue up a > + * bunch of work, hand the output over to a display server and the quit. > + * If the context is not marked as persistent, upon closing (either via ".. is marked as not persistent," is more clear. Regards, Joonas > + * an explicit DRM_I915_GEM_CONTEXT_DESTROY or implicitly from file closure > + * or process termination), the context and any outstanding requests will be > + * cancelled (and exported fences for cancelled requests marked as -EIO). > + * > + * By default, new contexts allow persistence. > + */ _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx