On Wed, Jan 20, 2016 at 02:50:47PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> In GuC mode LRC pinning lifetime depends exclusively on the
> request lifetime. Since that is terminated by the seqno update,
> that opens up a race condition between the GPU finishing writing
> out the context image and the driver unpinning the LRC.
>
> To extend the LRC lifetime we will employ a similar approach
> to what legacy ringbuffer submission does.
>
> We will start tracking the last submitted context per engine
> and keep it pinned until it is replaced by another one.
>
> Note that the driver unload path is a bit fragile and could
> benefit greatly from efforts to unify the legacy and exec
> list submission code paths.
>
> At the moment i915_gem_context_fini has special casing for the
> two which is potentially not needed, and also depends on
> i915_gem_cleanup_ringbuffer running before itself.
>
> v2:
>  * Move pinning into engine->emit_request and actually fix
>    the reference/unreference logic. (Chris Wilson)
>
>  * ring->dev can be NULL on driver unload so use a different
>    route towards it.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> Issue: VIZ-4277
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Nick Hoath <nicholas.hoath@xxxxxxxxx>
> ---
> I cannot test this with GuC, but it passes BAT with execlists
> and some real-world smoke tests.
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 4 +++-
>  drivers/gpu/drm/i915/intel_lrc.c        | 9 ++++++++-
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index c25083c78ba7..0b419e165836 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -438,7 +438,9 @@ void i915_gem_context_fini(struct drm_device *dev)
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		struct intel_engine_cs *ring = &dev_priv->ring[i];
>

This is the nasty part where the GPU still has access to the backing pages
as we release them. A hard-to-hit module-unload vs active-GPU race for sure,
but it is something that we can prevent.

The context-fini vs engine-fini ordering is also apparently tied to the use
of intel_gpu_reset() here (as it clobbers the GPU without dropping the GEM
state, causing a foul-up when tearing down the engine). If we had actually
called i915_reset() (and so i915_gem_reset()) instead, we should expect
last_context to be NULL... Oops, Tvrtko, you need to inspect
i915_gem_context_reset() as well for resetting last_context back to NULL
after the GPU reset.

As for here, I would just pencil in a plan to replace this chunk entirely
with a call to i915_reset(), or a slightly trimmed-down version, and tidy
up the active GEM cleanup in the process.
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx