Quoting Bloomfield, Jon (2019-08-07 16:29:55)
> > -----Original Message-----
> > From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Sent: Wednesday, August 7, 2019 8:08 AM
> > To: Bloomfield, Jon <jon.bloomfield@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>; Winiarski, Michal <michal.winiarski@xxxxxxxxx>
> > Subject: RE: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> >
> > Quoting Bloomfield, Jon (2019-08-07 15:33:51)
> > [skip to end]
> > > We didn't explore the idea of terminating orphaned contexts though
> > > (where none of their resources are referenced by any other contexts).
> > > Is there a reason why this is not feasible? In the case of compute
> > > (certainly HPC) workloads, there would be no compositor taking the
> > > output, so this might be a solution.
> >
> > Sounds easier said than done. We have to go through each request and
> > determine if it has an external reference (or if the object holding the
> > reference has an external reference) to see if the output would be
> > visible to a third party. Sounds like a conservative GC :|
> > (Coming to that conclusion suggests that we should structure the request
> > tracking to make reparenting easier.)
> >
> > We could take a pid-1 approach and move all the orphan timelines over to
> > a new parent purely responsible for them. That honestly doesn't seem to
> > achieve anything. (We are still stuck with tasks on the GPU and no way
> > to kill them.)
> >
> > In comparison, persistence is a rarely used "feature" and cleaning up on
> > context close fits in nicely with the process model. It just works as
> > most users/clients would expect. (Although running non-persistent by
> > default hasn't shown anything to explode on the desktop, it's too easy
> > to construct scenarios where persistence turns out to be an advantage,
> > particularly with chains of clients (the compositor model).)
> > Between the two modes, we should have most bases covered; it's hard to
> > argue for a third way (that is, until someone has a usecase!)
> > -Chris
>
> Ok, makes sense. Thanks.
>
> But have we converged on a decision? :-)
>
> As I said, requiring a compute umd opt-in should be ok for the immediate
> HPC issue, but I'd personally argue that it's valid to change the
> contract for hangcheck=0 and switch the default to non-persistent.

Could you tender

diff --git a/runtime/os_interface/linux/drm_neo.cpp b/runtime/os_interface/linux/drm_neo.cpp
index 31deb68b..8a9af363 100644
--- a/runtime/os_interface/linux/drm_neo.cpp
+++ b/runtime/os_interface/linux/drm_neo.cpp
@@ -141,11 +141,22 @@ void Drm::setLowPriorityContextParam(uint32_t drmContextId) {
     UNRECOVERABLE_IF(retVal != 0);
 }
 
+void Drm::setNonPersistent(uint32_t drmContextId) {
+    drm_i915_gem_context_param gcp = {};
+    gcp.ctx_id = drmContextId;
+    gcp.param = 0xb; /* I915_CONTEXT_PARAM_PERSISTENCE */
+
+    ioctl(DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &gcp);
+}
+
 uint32_t Drm::createDrmContext() {
     drm_i915_gem_context_create gcc = {};
     auto retVal = ioctl(DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &gcc);
     UNRECOVERABLE_IF(retVal != 0);
 
+    /* enable cleanup of resources on process termination */
+    setNonPersistent(gcc.ctx_id);
+
     return gcc.ctx_id;
 }

to interested parties?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx