On Thu, Oct 08, 2015 at 03:03:11PM +0100, Tvrtko Ursulin wrote: > > Hi, > > On 08/10/15 14:45, Ville Syrjälä wrote: > > On Mon, Oct 05, 2015 at 01:26:36PM +0100, Tvrtko Ursulin wrote: > >> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >> > >> Prevent leaking VMAs and PPGTT VMs when objects are imported > >> via flink. > >> > >> Scenario is that any VMAs created by the importer will be left > >> dangling after the importer exits, or destroys the PPGTT context > >> with which they are associated. > >> > >> This is caused by object destruction not running when the > >> importer closes the buffer object handle due the reference held > >> by the exporter. This also leaks the VM since the VMA has a > >> reference on it. > >> > >> In practice these leaks can be observed by stopping and starting > >> the X server on a kernel with fbcon compiled in. Every time > >> X server exits another VMA will be leaked against the fbcon's > >> frame buffer object. > >> > >> Also on systems where flink buffer sharing is used extensively, > >> like Android, this leak has even more serious consequences. > >> > >> This version is takes a general approach from the earlier work > >> by Rafael Barbalho (drm/i915: Clean-up PPGTT on context > >> destruction) and tries to incorporate the subsequent discussion > >> between Chris Wilson and Daniel Vetter. > >> > >> v2: > >> > >> Removed immediate cleanup on object retire - it was causing a > >> recursive VMA unbind via i915_gem_object_wait_rendering. And > >> it is in fact not even needed since by definition context > >> cleanup worker runs only after the last context reference has > >> been dropped, hence all VMAs against the VM belonging to the > >> context are already on the inactive list. > >> > >> v3: > >> > >> Previous version could deadlock since VMA unbind waits on any > >> rendering on an object to complete. Objects can be busy in a > >> different VM which would mean that the cleanup loop would do > >> the wait with the struct mutex held. > >> > >> This is an even simpler approach where we just unbind VMAs > >> without waiting since we know all VMAs belonging to this VM > >> are idle, and there is nothing in flight, at the point > >> context destructor runs. > >> > >> v4: > >> > >> Double underscore prefix for __915_vma_unbind_no_wait and a > >> commit message typo fix. (Michel Thierry) > >> > >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >> Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak > >> Reviewed-by: Michel Thierry <michel.thierry@xxxxxxxxx> > >> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Cc: Rafael Barbalho <rafael.barbalho@xxxxxxxxx> > >> Cc: Michel Thierry <michel.thierry@xxxxxxxxx> > > > >> --- > >> drivers/gpu/drm/i915/i915_drv.h | 5 +++++ > >> drivers/gpu/drm/i915/i915_gem.c | 20 ++++++++++++++++---- > >> drivers/gpu/drm/i915/i915_gem_context.c | 24 ++++++++++++++++++++++++ > >> 3 files changed, 45 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > >> index 824e7245044d..58afa1bb2957 100644 > >> --- a/drivers/gpu/drm/i915/i915_drv.h > >> +++ b/drivers/gpu/drm/i915/i915_drv.h > >> @@ -2840,6 +2840,11 @@ i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj, > >> int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level, > >> u32 flags); > >> int __must_check i915_vma_unbind(struct i915_vma *vma); > >> +/* > >> + * BEWARE: Do not use the function below unless you can _absolutely_ > >> + * _guarantee_ VMA in question is _not in use_ anywhere. > >> + */ > >> +int __must_check __i915_vma_unbind_no_wait(struct i915_vma *vma); > >> int i915_gem_object_put_pages(struct drm_i915_gem_object *obj); > >> void i915_gem_release_all_mmaps(struct drm_i915_private *dev_priv); > >> void i915_gem_release_mmap(struct drm_i915_gem_object *obj); > >> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > >> index f0cfbb9ee12c..52642aff1dab 100644 > >> --- a/drivers/gpu/drm/i915/i915_gem.c > >> +++ b/drivers/gpu/drm/i915/i915_gem.c > >> @@ -3208,7 +3208,7 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj) > >> old_write_domain); > >> } > >> > >> -int i915_vma_unbind(struct i915_vma *vma) > >> +static int __i915_vma_unbind(struct i915_vma *vma, bool wait) > >> { > >> struct drm_i915_gem_object *obj = vma->obj; > >> struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > >> @@ -3227,9 +3227,11 @@ int i915_vma_unbind(struct i915_vma *vma) > >> > >> BUG_ON(obj->pages == NULL); > >> > >> - ret = i915_gem_object_wait_rendering(obj, false); > >> - if (ret) > >> - return ret; > >> + if (wait) { > >> + ret = i915_gem_object_wait_rendering(obj, false); > >> + if (ret) > >> + return ret; > >> + } > >> > >> if (i915_is_ggtt(vma->vm) && > >> vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) { > >> @@ -3274,6 +3276,16 @@ int i915_vma_unbind(struct i915_vma *vma) > >> return 0; > >> } > >> > >> +int i915_vma_unbind(struct i915_vma *vma) > >> +{ > >> + return __i915_vma_unbind(vma, true); > >> +} > >> + > >> +int __i915_vma_unbind_no_wait(struct i915_vma *vma) > >> +{ > >> + return __i915_vma_unbind(vma, false); > >> +} > >> + > >> int i915_gpu_idle(struct drm_device *dev) > >> { > >> struct drm_i915_private *dev_priv = dev->dev_private; > >> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > >> index 74aa0c9929ba..680b4c9f6b73 100644 > >> --- a/drivers/gpu/drm/i915/i915_gem_context.c > >> +++ b/drivers/gpu/drm/i915/i915_gem_context.c > >> @@ -133,6 +133,23 @@ static int get_context_size(struct drm_device *dev) > >> return ret; > >> } > >> > >> +static void i915_gem_context_clean(struct intel_context *ctx) > >> +{ > >> + struct i915_hw_ppgtt *ppgtt = ctx->ppgtt; > >> + struct i915_vma *vma, *next; > >> + > >> + if (WARN_ON_ONCE(!ppgtt)) > >> + return; > > > > [ 80.242892] [drm:i915_gem_open] > > [ 80.250485] ------------[ cut here ]------------ > > [ 80.258938] WARNING: CPU: 1 PID: 277 at ../drivers/gpu/drm/i915/i915_gem_context.c:141 i915_gem_context_free+0xef/0x27b [i915]() > > [ 80.275344] WARN_ON_ONCE(!ppgtt) > > Ha, I did not see how ctx->ppgtt == NULL could exist but wanted to be > extra safe since i915_ppgtt_put checks for that. Well, how can it exit? -- Ville Syrjälä Intel OTC _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx