Quoting Bloomfield, Jon (2019-08-07 16:29:55)
> > -----Original Message-----
> > From: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Sent: Wednesday, August 7, 2019 8:08 AM
> > To: Bloomfield, Jon <jon.bloomfield@xxxxxxxxx>; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>; Winiarski, Michal <michal.winiarski@xxxxxxxxx>
> > Subject: RE: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> >
> > Quoting Bloomfield, Jon (2019-08-07 15:33:51)
> > [skip to end]
> > > We didn't explore the idea of terminating orphaned contexts though
> > > (where none of their resources are referenced by any other contexts).
> > > Is there a reason why this is not feasible? In the case of compute
> > > (certainly HPC) workloads, there would be no compositor taking the
> > > output, so this might be a solution.
> >
> > Sounds easier said than done. We have to go through each request and
> > determine if it has an external reference (or if the object holding the
> > reference has an external reference) to see if the output would be
> > visible to a third party. Sounds like a conservative GC :|
> > (Coming to that conclusion suggests that we should structure the request
> > tracking to make reparenting easier.)
> >
> > We could take a pid-1 approach and move all the orphan timelines over to
> > a new parent purely responsible for them. That honestly doesn't seem to
> > achieve anything. (We are still stuck with tasks on the GPU and no way
> > to kill them.)
> >
> > In comparison, persistence is a rarely used "feature" and cleaning up on
> > context close fits in nicely with the process model. It just works as
> > most users/clients would expect. (Although running non-persistent by
> > default hasn't shown anything exploding on the desktop, it's too easy
> > to construct scenarios where persistence turns out to be an advantage,
> > particularly with chains of clients (the compositor model).) Between the
> > two modes, we should have most bases covered; it's hard to argue for a
> > third way (that is, until someone has a use case!)
> > -Chris
>
> Ok, makes sense. Thanks.
>
> But have we converged on a decision? :-)
>
> As I said, requiring compute UMD opt-in should be ok for the immediate
> HPC issue, but I'd personally argue that it's valid to change the
> contract for hangcheck=0 and switch the default to non-persistent.

I don't have to like it, but I think that's what we have to do for the
interim 10 years or so.
-Chris
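
For reference, the compute UMD opt-in discussed above maps onto the i915
context-param interface. Below is a minimal sketch of how a userspace driver
might mark one of its contexts non-persistent, assuming the
I915_CONTEXT_PARAM_PERSISTENCE parameter that eventually landed in the
upstream uapi; the exact parameter name and default behaviour in the patch
series under discussion here may differ.

  #include <stdint.h>
  #include <string.h>
  #include <sys/ioctl.h>

  #include <drm/i915_drm.h>

  /* Opt a context out of persistence: outstanding requests on it are
   * cancelled when the context (or its owning fd) is closed, rather
   * than being left to run to completion on the GPU. */
  static int ctx_set_nonpersistent(int drm_fd, uint32_t ctx_id)
  {
          struct drm_i915_gem_context_param p;

          memset(&p, 0, sizeof(p));
          p.ctx_id = ctx_id;
          p.param = I915_CONTEXT_PARAM_PERSISTENCE;
          p.value = 0; /* 0 = cancel on close, 1 = legacy persistent */

          return ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
  }

If the default were flipped to non-persistent for hangcheck=0, as suggested
above, the same parameter could presumably be used in the opposite direction
by clients (e.g. the compositor hand-off case) that genuinely rely on
persistence.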