Re: [PATCH v4] drm/i915: Clean up associated VMAs on context destruction

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Tue, 6 Oct 2015 10:34:48 +0100



On Tue, Oct 06, 2015 at 11:28:49AM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 10:04:31AM +0100, Chris Wilson wrote:
> > On Tue, Oct 06, 2015 at 10:58:01AM +0200, Daniel Vetter wrote:
> > > On Mon, Oct 05, 2015 at 01:26:36PM +0100, Tvrtko Ursulin wrote:
> > > > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > > > 
> > > > Prevent leaking VMAs and PPGTT VMs when objects are imported
> > > > via flink.
> > > > 
> > > > Scenario is that any VMAs created by the importer will be left
> > > > dangling after the importer exits, or destroys the PPGTT context
> > > > with which they are associated.
> > > > 
> > > > This is caused by object destruction not running when the
> > > > importer closes the buffer object handle due the reference held
> > > > by the exporter. This also leaks the VM since the VMA has a
> > > > reference on it.
> > > > 
> > > > In practice these leaks can be observed by stopping and starting
> > > > the X server on a kernel with fbcon compiled in. Every time
> > > > X server exits another VMA will be leaked against the fbcon's
> > > > frame buffer object.
> > > > 
> > > > Also on systems where flink buffer sharing is used extensively,
> > > > like Android, this leak has even more serious consequences.
> > > > 
> > > > This version is takes a general approach from the  earlier work
> > > > by Rafael Barbalho (drm/i915: Clean-up PPGTT on context
> > > > destruction) and tries to incorporate the subsequent discussion
> > > > between Chris Wilson and Daniel Vetter.
> > > > 
> > > > v2:
> > > > 
> > > > Removed immediate cleanup on object retire - it was causing a
> > > > recursive VMA unbind via i915_gem_object_wait_rendering. And
> > > > it is in fact not even needed since by definition context
> > > > cleanup worker runs only after the last context reference has
> > > > been dropped, hence all VMAs against the VM belonging to the
> > > > context are already on the inactive list.
> > > > 
> > > > v3:
> > > > 
> > > > Previous version could deadlock since VMA unbind waits on any
> > > > rendering on an object to complete. Objects can be busy in a
> > > > different VM which would mean that the cleanup loop would do
> > > > the wait with the struct mutex held.
> > > > 
> > > > This is an even simpler approach where we just unbind VMAs
> > > > without waiting since we know all VMAs belonging to this VM
> > > > are idle, and there is nothing in flight, at the point
> > > > context destructor runs.
> > > > 
> > > > v4:
> > > > 
> > > > Double underscore prefix for __915_vma_unbind_no_wait and a
> > > > commit message typo fix. (Michel Thierry)
> > > > 
> > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> > > > Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak
> > > > Reviewed-by: Michel Thierry <michel.thierry@xxxxxxxxx>
> > > > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > > > Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > > > Cc: Rafael Barbalho <rafael.barbalho@xxxxxxxxx>
> > > > Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
> > > 
> > > Queued for -next, thanks for the patch.
> > 
> > Please no, it's an awful patch and does not even fix the root cause of
> > the leak (that the vma are not closed when the handle is).
> 
> It's a lose-lose situation for me as maintainer (and this holds in general
> really, not just for this patch):
> - Either I wield my considerable maintainer powers and force proper
>   solution, which will piss of a lot of people.
> - Or I just merge intermediate stuff and piss of another set of people
>   (including likely our all future selves because we're slowly digging a
>   tech debt grave).
> 
> What I can't do with maintainer fu is force collaboration, I can only try
> to piss off everyone equally. And today the die rolled a "merge".

Pity it didn't roll "what's the impact and where is the bugzilla?" :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx